Category Archives: internet

Google apps in the real world

Or at least the semi-real world: Wired’s Michael Calore spent a month working (mostly) with Google apps rather than his usual desktop software (on a Mac). I’ve thought of trying this same experiment myself but haven’t yet felt that it is worth the risk.

A few points interested me. First, that he could live with the apps themselves, but ran into interoperability problems. One that surprised me:

One of our copy editors couldn’t open some docs I had exported, so I was forced to copy and paste those articles into Microsoft Word just for her.

I’d have thought the other way round (Word to Google) would be more difficult, because of missing features. Here is Google’s response, from product manager Jonathan Rochelle:

“It works best when everybody in the group is sharing on the same platform,” he says. “The experience you’d have if you were just sharing stored docs rather than your co-workers asking you to save down to the desktop would be much closer to ‘Wow, this is an incredible product’ instead of ‘Wow, this really stinks’.”

Oh, so if we all use Google we will be fine. That’s no better than Microsoft telling us all to use Office. Nevertheless I take the point to some extent – this web collaboration thing really only works if everyone plays. That doesn’t excuse the compatibility issues.

Calore mentions the privacy aspect, but more needs to be said here. I have no problem with internet storage; for example, I’m happy to save stuff to Amazon S3 without worrying that Jeff Bezos will start poking through the data. In fact, I consider S3 more secure than just saving files to a typical Linux box out on the Internet, especially a shared host. Google troubles me though, because its business model is contextual advertising: you agree to let it mine your data, albeit with self-imposed limitations.

Finally there’s the question of whether the apps themselves are good enough. Again, I’d have liked Calore to say more about this, though he does report how much he misses drag-and-drop. The impression I get is that the browser-based apps were a bit frustrating, but he says:

Eventually, I learned to accept that the browser had certain performance limitations that I would have to live with in exchange for the convenience of centralized storage and easy access. Rochelle says it’s just the way our brains are wired from decades of using desktop apps.

Calore has missed a point here – you can have centralized storage and easy access without necessarily using browser-based apps. Moving the server to the cloud is spot-on, but the case for solely browser-based apps is weaker. Its main advantage is zero install; but that’s the way desktop apps are going as well.

 

WPF/E is now Silverlight

Microsoft’s Flash alternative has a new name: Silverlight. Undoubtedly a radical shift in naming conventions. Back in 2005, Microsoft renamed Avalon to Windows Presentation Foundation, and I noted that:

These new names seem to be deliberately chosen to be forgettable.

Now memorable names are back. Interesting.

The name may be there, but the product is still in preview; the latest release remains the February CTP, with full release promised before July 2007. The emphasis is on video and vector graphics; there’s no common language runtime implementation yet, so for now you write Silverlight code in JavaScript. It’s cross-platform, but currently only supports Windows and Mac; no device support yet. See the FAQ here for more information, and a useful summary from Tim Sneath, who says there is big Silverlight news to come at Mix07.

 


Official performance patch for Outlook 2007

Computerworld has drawn my attention to a new performance patch for Outlook 2007, issued on Friday. Here’s what Microsoft says:

This update fixes a problem in which a calendar item that is marked as private is opened if it is found by using the Search Desktop feature. The update also fixes performance issues that occur when you work with items in a large .pst file or .ost file.

The patch is welcome; there’s no doubting that Outlook 2007 has proved horribly slow for many users. But does it fix the problems? If you read through the comments to earlier postings on this subject you’ll notice that there are actually several performance issues. The main ones I’m aware of:

  1. Slow receive from POP3 mail servers. Sometimes caused by conflicts between Vista’s TCP optimization and certain routers – see comment 27 here for a fix.
  2. Add-ins, for example Dell Media Direct, Acrobat PDFMaker, or Microsoft’s Business Contact Manager. See Tools – Trust Center – Add-ins and click Go next to the “COM Add-ins” dropdown to manage these.
  3. Desktop search indexing. You can disable this (it’s an add-in) but it is a shame to do so, since it is one of the best new features.
  4. Large local mailbox – could be a standalone .PST (Personal Store), or an .OST (Offline Store) that is kept in synch with Exchange.

The published fix appears to address only the problem with large local mailboxes.

Does it work? I’ve applied it, and it seems to help a bit, though I reckon performance remains worse than in Outlook 2003. My hunch is that the issues are too deep-rooted for a quick fix, especially if you keep desktop search enabled. I’ll be interested to see whether the patch fixes another Outlook 2007 annoyance: if you close down Windows while Search is still indexing Outlook, you almost always get a message saying “The data file ‘Mailbox …’ was not closed properly. The file is being checked for problems.” Then, of course, you wait and wait.

Is it our fault for having large mailboxes? Here’s a comment from Microsoft’s Jessica Arnold, quoted in the Computerworld article referenced above:

“Outlook wasn’t designed to be a file dump, it was meant to be a communications tool,” she said. “There is that fine line, but we don’t necessarily want to optimize the software for people that store their e-mail in the same .PST file for ten years.”

A fair point; yet quick, indexed access to email archives is important to many of us. Archiving to a PST is hazardous, especially since by default Outlook archives to the local machine, not to the server; and in many organizations local documents are not backed up. Running a large mailbox may not be a good solution, but what is better?

Perhaps the answer is Gmail, if you are always online and can cope with the privacy issues. Note the first selling point which Google claims for its service:

Fast search
Use Google search to find the exact message you want, no matter when it was sent or received.

Apparently Google understands that users want to be able to find old messages. Surely a desktop application should be at least as good at finding these as an internet mailbox that might be thousands of miles away?

Update: I still get “The data file ‘Mailbox …’ was not closed properly.” Not fixed.

See also http://blogs.msdn.com/willkennedy/archive/2007/04/17/outlook-performance-update.aspx where a member of the Outlook team further describes the patch.

 

HTML5 vs XHTML2 vs DoNothing

Simon Willison points to David “liorean” Andersson’s article on HTML5 vs XHTML2. This debate about the evolution of HTML has become confusing. In a nutshell, the W3C wanted to fix HTML by making it proper grown-up XML, hence XHTML, which was meant to succeed HTML 4.0. Unfortunately XHTML never really caught on. One of its inherent problems is nicely put by Andersson:

Among the reasons for this is the draconian error handling of XML. XML parsing will stop at the first error in the document, and that means that any errors will render a page totally unreachable. A document with an XML well formedness error will only display details of the error, but no content. On pages where some of the content is out of the control of XML tools with well-designed handling of different character encodings—where users may comment or post, or where content may come from the outside in the form of trackbacks, ad services, or widgets, for example—there’s always a risk of a well-formedness error. Tag-soup parsing browsers will do their best to display a page, in spite of any errors, but when XML parsing any error, no matter how small, may render your page completely useless.
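
To see the difference in miniature, here is a small PHP sketch (my own illustration, not from Andersson’s article) using DOMDocument, which offers both a strict XML parser and a lenient tag-soup HTML parser:

<?php
// A fragment with a well-formedness error: the <p> element is never closed.
$broken = '<html><body><p>Hello, world</body></html>';

$dom = new DOMDocument();

// Strict XML parsing: loadXML() gives up at the first error and returns false.
// This is the draconian behaviour that leaves an XHTML page unreadable.
var_dump(@$dom->loadXML($broken));   // bool(false)

// Tag-soup parsing: loadHTML() repairs the markup and returns true,
// which is roughly how browsers treat ordinary HTML.
var_dump(@$dom->loadHTML($broken));  // bool(true)
?>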

So nobody took much notice of XHTML; the W3C’s influence declined; and a rival anything-but-Microsoft group called WHATWG commenced work on its own evolution of HTML which it called HTML 5.

In the meantime the W3C eventually realised that XHTML was never going to catch on and announced that it would revive work on HTML. Actually it is still working on XHTML2 in parallel. I suppose the idea, to the extent it has been thought through, is that XHTML will be the correct format for the well-formed Web, and HTML for the ill-formed or tag-soup Web. The new W3C group has its charter here. In contrast to WHATWG, this group includes Microsoft; in fact, Chris Wilson from the IE team is co-chair with Dan Connolly. However, convergence with WHATWG is part of the charter:

The HTML Working Group will actively pursue convergence with WHATWG, encouraging open participation within the bounds of the W3C patent policy and available resources.

In theory then, WHATWG HTML 5 and W3C HTML 5 will be the same thing. Don’t hold your breath though, since according to the FAQ:

When will HTML 5 be finished? Around 15 years or more to reach a W3C recommendation (include estimated schedule).

I suppose the thing will move along and we will see bits of HTML 5 being implemented by the main browsers. But will it make much difference? Although HTML is a broken specification, it is proving sufficient to support AJAX and to host other interesting stuff like Flash/Apollo, WPF and WPF/E, and so on. Do we need HTML 5? It remains an open question. Maybe the existence of a working group where all the browser vendors are talking is reward in itself: it may help to fix the most pressing real-world problem, which is browser inconsistency.

 


Making search better: smarter algorithms, or richer metadata?

Ephraim Schwartz’s article on search fatigue starts with a poke at Microsoft (I did the same a couple of months ago), but goes on to look at the more interesting question of how search results can be improved. Schwartz quotes a librarian called Jeffrey Beall who gives a typical librarian’s answer:

The root cause of search fatigue is a lack of rich metadata and a system that can exploit the metadata.

It’s true up to a point, but I’ll back algorithms over metadata any day. A problem with metadata is that it is never complete and never up-to-date. Another problem is that it has a subjective element: someone somewhere (perhaps the author, perhaps someone else) decided what metadata to apply to a particular piece of content. In consequence, if you rely on the metadata you end up missing important results.

In the early days of the internet, web directories were more important than they are today. Yahoo started out as a directory: sites were listed hierarchically and you drilled down to find what you wanted. Yahoo still has a directory; so does Google; another notable example is dmoz. Directories apply metadata to the web; in fact, they are metadata (data about data).

I used to use directories, until I discovered AltaVista, which, as Wikipedia says, was “the first searchable, full-text database of a large part of the World Wide Web.” AltaVista gave me many more results; many of them were irrelevant, but I could narrow the search by adding or excluding words. I found it quicker and more useful than trawling through directories. I would rather make my own decisions about what is relevant.

The world agreed with me, though it was Google and not AltaVista which reaped the reward. Google searches everything, more or less, but ranks the results using algorithms based on who knows what – incoming links, the past search habits of the user, and a zillion other factors. This has changed the world.

Even so, we can’t shake off the idea that better metadata could further improve search, and therefore improve our whole web experience. Wouldn’t it be nice if we could distinguish homonyms like pipe (plumbing), pipe (smoking) and pipe (programming)? What about microformats, which identify rich data types like contact details (there is a small example at the end of this post)? What about tagging – even this post is tagged? Or all the semantic web stuff which has suddenly excited Robert Scoble:

Basically Web pages will no longer be just pages, or posts. They’ll all be split up into little objects, stored in a database (a massive, scalable one at that) and then your words can be displayed in different ways. Imagine a really awesome search engine that could bring back much much more granular stuff than Google can today.

Maybe, but I’m a sceptic. I don’t believe we can ever be sufficiently organized, as a global community, to follow the rules that would make it work. Sure, there is and will be partial success. Metadata has its place; it will always be there. But in the end I don’t think the clock will turn back; I think plain old full-text search combined with smart ranking algorithms will always be more important, to the frustration of librarians everywhere.
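
As a footnote, the microformats idea is easy to show. Here is a contact marked up with hCard class names (taken from the published hCard convention), and a few lines of PHP – my own sketch, nothing more – pulling the structured data back out:

<?php
// Contact details marked up with hCard microformat class names.
$html = '<div class="vcard">
           <span class="fn">Joe Bloggs</span>
           <span class="tel">0123 456 789</span>
           <a class="url" href="http://www.example.com/">example.com</a>
         </div>';

// Any consumer that knows the hCard class names can recover the fields.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$name = $xpath->query('//*[@class="fn"]')->item(0)->textContent;
$tel  = $xpath->query('//*[@class="tel"]')->item(0)->textContent;
echo "$name, $tel\n";   // Joe Bloggs, 0123 456 789
?>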

 

Infinitely scalable web services

Amazon’s Jeff Barr links to several posts about building scalable web services on S3 (web storage) and EC2 (on-demand server instances).

I have not had time to look into the detail of these new initiatives, but the concept is compelling. This is where Amazon’s programmatic approach pays off in a big way. Let me summarise:

1. You have some web application or service. Anything you like. Football results; online store; share dealing; news service; video streaming; you name it.

2. Demand of course fluctuates. When your server gets busy, the application automatically fires up new server instances and performance does not suffer. When demand tails off, the application automatically shuts down server instances, saving you money and making those resources available to other EC2 users.

3. Storage is not an issue; S3 has unlimited expandability.

This approach makes huge sense. Smart programming replaces brute force hardware investment. I like it a lot.
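
To put a little flesh on the idea, here is a sketch of the kind of control loop such an application might run. The ec2_* functions and the load figure are stubs I have made up purely for illustration; in reality you would call EC2’s RunInstances and TerminateInstances operations through its web service API and measure genuine traffic.

<?php
// Illustration only: stub functions stand in for EC2 web service calls
// and for whatever load metric the application actually tracks.
function ec2_run_instance($ami)        { echo "launch $ami\n"; return uniqid('i-'); }
function ec2_terminate_instance($id)   { echo "terminate $id\n"; }
function current_requests_per_second() { return rand(0, 1000); }   // fake load

$instances      = array();   // IDs of the server instances we have started
$maxPerInstance = 200;       // assumed capacity of one instance, requests/sec

for ($tick = 0; $tick < 5; $tick++) {   // a few iterations for the demo
    $load   = current_requests_per_second();
    $needed = max(1, (int) ceil($load / $maxPerInstance));

    // Demand rising: fire up extra instances. Demand falling: shut them down.
    while (count($instances) < $needed) {
        $instances[] = ec2_run_instance('ami-mywebapp');
    }
    while (count($instances) > $needed) {
        ec2_terminate_instance(array_pop($instances));
    }
    // A real service would sleep(60) here and loop indefinitely.
}
?>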

 


MP3 device runs .NET – but in Mono guise

I’ve long been interested in Mono, the open-source implementation of Microsoft .NET. It seems to be maturing; the latest sign is the appearance of an MP3 player using Linux and Mono. Engadget has an extensive review. Miguel de Icaza says on his blog:

The Sansa Connect is running Linux as its operating system, and the whole application stack is built on Mono, running on an ARM processor.

I had not previously considered Mono for embedded systems; yet here it is, and why not?

The device is interesting too. As Engadget says:

… you can get literally any music in Yahoo’s catalog whenever you have a data connection handy

This has to be the future of portable music. It’s nonsense loading up a device with thousands of songs when you can have near-instant access to whatever you like. That said, wi-fi hotspots are not yet sufficiently widespread or cheap for this to work for me; but this model is the one that makes sense, long-term.

I wonder if iPhone/iTunes will end up doing something like this?


Delphi for PHP first impressions

I tried out Delphi for PHP for the first time this weekend.

Install on Vista was smooth. The setup installs its own copy of Apache 2 and PHP 5. A few minutes later I was up and running.

The IDE is Delphi-like. Here is a scrunched-up image to give you a flavour:

 

I have a standard application I build when trying out a new development tool. It is a to-do list with a listbox, a textbox, and buttons to add and remove items from the list. I started well, and soon had the controls placed, though they are tricky to line up nicely. I resorted to setting the Left property, as snap-to-grid did not work for me.

Then I double-clicked the Add button. As expected, I was greeted with an empty Click handler. What to type? After a little experimentation I came up with this:

$this->lstItems->AddItem($this->ebItem->Text,null,null);

When you type ->, the editor pops up autocomplete choices. Nice. I clicked the run button and the application opened in my web browser. I set a breakpoint on the line; that worked nicely, especially after I displayed the Locals window so I could see the value of variables.

The next step is to implement removing an item. This is fractionally more challenging (I realise this is little more than Hello World), since I need to retrieve the index of the selected item and then work out how to remove it.

I am embarrassed to admit that it took me some time. Yes, I tried the documentation, but it is terrible. Unbelievably bad. Someone ran a thing called Doc-O-Matic over the code. Here’s the entire description of the ListBox control:

A class to encapsulate a listbox control 

There’s also a reference which lists methods, again with a one-line description if you are lucky. Here’s the one for ListBox.getItems:

This is getItems, a member of class ListBox.

I gave up on the docs. I had figured out AddItem; I had discovered that the itemindex property has the index of the selected item; but there is no RemoveItem or DeleteItem. I went back to basics. The ListBox has an _items member field which is an array. In PHP you remove an item from an array with unset(). I resorted to editing the VCL for PHP by adding a RemoveAt method to CustomListBox:

function RemoveAt($index)
{
    // Remove the entry from the listbox's internal items array.
    // Note: unset() leaves a gap in the numeric keys rather than reindexing them.
    unset($this->_items[$index]);
}

Note that I am not proposing you do the same. There must be a better way to do this. I just couldn’t work it out quickly from the docs; and I was determined to get this up and running.
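
One refinement worth noting: unset() leaves a hole in the array keys rather than reindexing them. If the listbox code expects contiguous indexes (an assumption I have not checked), array_splice() would remove the element and reindex in one go, something like this:

function RemoveAt($index)
{
    // array_splice removes one element at $index and renumbers
    // the remaining numeric keys, unlike unset().
    array_splice($this->_items, $index, 1);
}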

Here’s my code for removing an item:

$selindex = $this->lstItems->itemindex;

if ($selindex > -1)
{
    $this->lstItems->RemoveAt($selindex);
}

Now my app worked fine. What about deployment? I used the deployment wizard, which essentially copies a bunch of files into a directory, ready for upload. There are a lot: 44 files to be precise, mostly of course the VCL for PHP. Still, it was painless, and you can configure a web server to share these files between different applications.

All I needed to test it was a web server running PHP 5.x (it will not work with PHP 4). Fortunately I had one available, so I uploaded my first Delphi for PHP application. It looked good; but although it worked on my local machine, the deployed app threw an error when I clicked a button:

Application raised an exception class Exception with message ‘The Input Filter PHP extension is not setup on this PHP installation, so the contents returned by Input is *not* filtered’

I note that this user has the same problem. My hunch is that Delphi for PHP requires PHP 5.2 – I only have 5.1 at the moment.*

In addition, I don’t like the way the default deployment handles errors, publishing my call stack to the world, complete with the location of the files on my web server.

How secure are all these VCL for PHP files anyway? What assurance do I have about this? Will they be patched promptly if security issues are discovered?

Important questions.

There will be plenty more to say about Delphi for PHP. For the moment I’m reserving judgment. I will say that the release looks rushed, which is a shame.

Update: I’ve now seen a fix posted to the Borland newsgroups for the input filter exception, showing how to remove the code which raises it. However, I suggest you do not apply this fix, for security reasons, unless you are deploying on a trusted intranet. It is vital to sanitize PHP input on the internet.

*PHP 5.2 is not the answer. It could even be a problem. Delphi for PHP ships with PHP 5.1. There is an input filter extension which you can add to PHP 5.x; see http://pecl.php.net/package/filter. The filter functions are built into PHP 5.2, but the version used by VCL for PHP is old and seems to be incompatible. What a mess.
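
If you hit the same exception, a quick first step is to check what the target server actually provides. A minimal diagnostic sketch (assuming the extension registers itself under the name 'filter', which is worth confirming against your own phpinfo() output):

<?php
// Report the PHP version and whether an input filter extension is present.
echo 'PHP version: ' . PHP_VERSION . "\n";
echo 'filter extension loaded: ' . (extension_loaded('filter') ? 'yes' : 'no') . "\n";
?>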


Try Delphi for PHP for one day

CodeGear is offering a free trial of Delphi for PHP … for a single day:

Long enough to evaluate a developer product? To my mind this is taking RAD a step too far. Just as well, since this is what I got when I tried to download it: an error.

This means one of two things. It either demonstrates the huge interest in Delphi for PHP, or the unfortunate lack of scalability in CodeGear’s server applications. Which, it appears, are not coded in PHP.

To be fair, the product has just been slashdotted. The thread is not especially illuminating so far, though I thought this was a telling comment:

For a reference, this is how this looks in plain PHP (granted no MVC and so on, but for the sake of example..):
<?php echo "Hello World" ?>
What does Delphi do?

  1. Loads several thousand lines VCL code
  2. Loads all the menu, form, container and “external” controls, although they’re not used (thousands of lines of code)
  3. The Hello World is a label (no simpler way) which has around 50 properties (color, bg color and what not) defined in an XML file. I left all at defaults, but never mind. The file is loaded, parsed.
  4. The Label class inherits from CustomLabel, which inherits from Components which inherits from other stuff I didn’t even bother check, it goes through all properties, and figures out after a lot of thinking that it should print the words “Hello World”.

Yes, that’s the trade-off with frameworks, though some are better than others. Now we need some counter-examples. Anyone?

 


Delphi for PHP is done

Hot on the heels of Delphi 2007, CodeGear has announced the completion of Delphi for PHP. Apparently those buying the download version can purchase immediately.

The name is controversial: this product uses neither the Delphi IDE nor the Delphi language. Rather, it is inspired by Delphi; maybe it was created with it too. I guess it would have used the Delphi IDE had it not been a third-party buy-in; perhaps it will in future.

The associated library, called VCL for PHP, is meant to be open source; but its home page on SourceForge remains empty at the time of writing.

More when I’ve had a chance to try it out; again, I’d be interested in hearing from early adopters.

 
