
Reserved IPs and other Microsoft Azure annoyances

I have been doing a little work with Microsoft’s Azure platform recently. A common requirement is that you want a VM which is internet-accessible with a custom domain, for which the best solution is to create an A record in your DNS pointing to the IP number of the VM. In order to do this reliably, you need to reserve an IP number for the VM; otherwise Azure may assign a different IP number if you shut it down and later restart it. If you keep it running you can keep the IP number, but this also means you have to pay for the VM continuously.

Azure now offers reserved IP numbers. Useful; but note that you can only link a VM with a reserved IP number when it is created, and to do this you have to create the VM with PowerShell.
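To make this concrete, here is a minimal sketch using the Azure service management cmdlets current at the time of writing; the names, region and credentials are illustrative:

    # Reserve an IP number in the region where the VM will live
    New-AzureReservedIP -ReservedIPName "mywebip" -Location "North Europe"

    # The reservation can only be attached when the VM is created
    New-AzureVMConfig -Name "myvm" -InstanceSize Small -ImageName $imageName |
        Add-AzureProvisioningConfig -Windows -AdminUsername $adminUser -Password $adminPassword |
        New-AzureVM -ServiceName "myservice" -Location "North Europe" -ReservedIPName "mywebip"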

What if you want to assign a reserved IP number to an existing VM? One suggestion is that you can capture an image from the VM, and then create a new VM from the image, complete with reserved IP. I went partially down this route but came unstuck because Azure for some reason captured the image into a different region (West Europe) from the region where the VM used to be (North Europe). When I ran the magic PowerShell script, it complained that the image was in the wrong region. I then found a post explaining how to move images between regions, which I did, but the metadata of the moved image was not quite the same and creating a new VM from the image did not work. At this point I realised that it would be easier to recreate the VM from scratch.

Note that when reserved IP numbers were announced in May 2014, program manager Mahesh Thiagarajan said:

The platform doesn’t support reserving the IP address of the existing Cloud Services or Virtual machines. We expect to announce support for this in the near future.

You can debate what is meant by “near future” and whether Microsoft has already failed this expectation.

There is another wrinkle here that I am not clear about. Some Azure VMs have special pricing, such as those with SQL Server pre-installed. The special pricing is substantial, often forming the largest part of the price, since it includes licensing fees. What happens to the special pricing if you fiddle with cloning VMs, creating new VMs with existing VHDs, moving VMs between regions, or the like? If the special pricing is somehow lost, how do you restore it so that SQL Server (for example) is still properly licensed? I imagine this would mean a call to support. I have not seen this addressed in documentation, such as this post about moving a virtual machine into a virtual network.

And there’s another thing. If you want your VM to be in a virtual network, you have to do that when you create it as well; it is a similar problem.

While I am in complaining mode, here is another. Creating a VM with PowerShell is easy enough, but you do need to know the image name you are using, and this is not shown in the friendly portal GUI.

In order to get the image names, I ran a PowerShell script that exports the available images to a file. I was surprised how many there are: the resulting output has around 13,500 lines and finding what you want is tedious.
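A script along these lines does the job (the output file name is arbitrary):

    # Dump every available image name to a file for searching
    Get-AzureVMImage | Select-Object ImageName, Label | Out-File azure-images.txt

    # Or filter at the pipeline instead, for example for SQL Server images
    Get-AzureVMImage | Where-Object { $_.Label -match "SQL Server" } | Select-Object ImageName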

Azure is mostly very good in my experience, but I would like to see these annoyances fixed. I would be interested to hear of other things that make the cloud admin or developer’s life harder than it should be.

So that was 2014: Samsung stumbles, all change for Microsoft, Sony hack, more cloud, more mobile

What happened in 2014? One thing I did not predict is that Samsung lost its momentum. Here are Gartner’s figures for global smartphone sales by vendor, for the third quarter of 2014:

[chart: Gartner, worldwide smartphone sales by vendor, Q3 2014]

Samsung is still huge, of course. But in 2013, Samsung seemed to be in such control of its premium brand that it could shape Android as it wished, rather than being merely an OEM for Google’s operating system. In the enterprise, Samsung KNOX held promise as a way to bring security and manageability to Android, but only in Samsung’s flavour. Today, that seems less likely. Market share is declining, and much of KNOX has been rolled into Android Lollipop. What is going wrong? The difficulty for Samsung is how to differentiate its products sufficiently, to avoid bleeding market share to keenly priced competition from vendors such as Xiaomi and Huawei. This is difficult if you do not control the operating system.

What of the overall mobile OS wars? 2014 brought few surprises: the Apple/Android duopoly continued, Blackberry further diminished its share, and Windows Phone struggled on, though it was not looking good for Microsoft’s OS as 2014 closed; the Nokia acquisition may have been fumbled.

All change at Microsoft

That brings me to Microsoft, a company I watch closely. 2014 saw Satya Nadella appointed as CEO and several strategic changes, though the extent to which Nadella introduced those changes is uncertain. What changes?

Office is going truly cross-platform, with first-class support for iOS and Android. I covered this recently on the Register; the summary is that there will be mobile versions of Office for iOS, Android and Windows (this last a Store app) with similar features, and that more and more of the functionality of desktop Office will turn up in the mobile versions. I learned from my interview with Technical Product Manager Kaberi Chowdhury that ODF (Open Document) support is planned, as is some level of programmability.

The plans for Office are a clue to the company’s wider strategy, which is focused on cloud and server. Key products include Office 365, Windows Azure, Active Directory (and Azure Active Directory), SQL Server, SharePoint, and System Center as a management tool for hybrid cloud.

The Windows client strategy is to bring back users who disliked Windows 8 with a renewed focus on the desktop in the forthcoming Windows 10, while retaining the Store app model for apps that are secure, touch-friendly, and easily deployed. It is still not clear what Windows 10 phones and tablets will look like, but we can expect convergence; no more Windows RT, but perhaps tablets running Windows Phone OS that are in effect the next generation of Windows RT without a desktop personality.

The company will also hedge its bets with full app support for Office and its cloud services on iOS and Android, and in doing so will make its Windows mobile offerings less compelling.

Microsoft’s developer tools are changing in line with this strategy. The next generation of .NET is open source and cross-platform on the server side, for Windows, Mac and Linux. Xamarin plugs the gap for .NET on iOS and Android, while Microsoft is also adding native support (not .NET based) for cross-platform mobile in the next Visual Studio.

These are big changes to the developer stack, and Microsoft is forking .NET between the continuing Windows-only .NET Framework, and the new cross-platform .NET Core. Developers have many questions about this; see this interview on the Register for what I could glean about the current plans. Watch out for the Build conference at the end of April, when the company will attempt to put it all together into a coherent whole for developers targeting either Windows 10, or cloud apps, or cloud services with cross-platform mobile clients.

This entire strategy is a logical progression from the company’s failure in mobile. Can it now succeed with client apps running on platforms controlled by its competitors? Alternatively, is there hope that Windows 10 can keep businesses hooked on Windows clients? Maybe 2015 will bring some answers, though with Windows 10 not expected until towards the end of the year there will be a long wait while iOS, Android and even Chrome OS (the operating system of Chromebook) continue to build.

A side effect is that C# now has a better chance of building a cross-platform user base, rather than being a Windows language. This has already happened in game development, thanks to the use of Mono and C# in the popular Unity game engine. Could it also happen with ASP.NET, deployed to Linux servers, now that this will be officially supported? Or is there little room for it alongside Java, PHP, Ruby, Node.js and the rest? 

The puzzle with Microsoft is that there is still too much mediocrity and complacency that damages the company’s offerings. How can it expect to succeed in the crowded wearable market with a band that is uncomfortable to wear? There is still an attitude in some parts of the company that the world will be happy to put up with problems that might be fixed in a future version after some long interval. Then again, the Azure team is doing great things and Windows server continues to impress. Win or lose, there will be plenty of Microsoft news this year.

A theme for 2015: cloud optimization

Late last year I attended Amazon’s re:Invent conference in Las Vegas; I wrote this up here. The key announcement for me was Amazon Aurora, a MySQL clone, not so much because of its merits as a cloud database server, but more because it represents a new breed of applications that are designed for the cloud. If you design database storage with the knowledge that it will only ever run on a huge cloud-scale infrastructure, you can make optimizations that cannot be replicated on smaller systems. I tried to summarize what this means in another Register piece here. The fact that this type of technology can be rented by any of us at commodity prices increases the advantage of public cloud, despite reservations that many still have concerning security and control. It also poses a challenge for companies like Oracle and Microsoft whose technology is designed for on-premises as well as cloud deployment; they cannot achieve the same advantage unless they fork their products, creating cloud variants that use different architecture.

The Sony hack

The cyber invasion of Sony Pictures in late November was not just another hack; it was a comprehensive takedown in which (as far as I can tell) the company’s entire IT systems were compromised and significantly damaged.

According to this report:

Mountains of documents had been stolen, internal data centers had been wiped clean, and 75 percent of the servers had been destroyed.

Most IT admins worry about disaster recovery (what to do after catastrophic system failure such as a fire in your data center) as well as about security (what to do if hackers gain access to sensitive information). In this case, both seemed to happen simultaneously. Further, as producing movies is in effect a digital business, the business suffered loss of some of its actual products, such as the unreleased “Annie”.

The incident is fascinating in itself, especially as we do not know the identity of the hackers or their purpose, but what interests me more are the implications.

Specifically, how many companies are equally at risk? It seems clear that Sony’s security was towards the weak end of the scale, but there is plenty of weak security out there, especially but not exclusively in smaller businesses.

With the outcome of the Sony hack so spectacular, it is likely that there will be similar efforts in 2015, as well as many businesses looking nervously at their own practices and wondering what they can do to protect themselves.

Cloud may be part of the answer, though even if the cloud provider does security right, that is no guarantee that its customers will do the same.

Looking back on looking back

Here is what I wrote a year or so ago, Reflecting on 2013: the year of not the PC, no privacy, and the Internet of Things. Most of it still applies. I have not achieved any of the three goals I set for myself though. Maybe this year…

SSD storage has come to Azure VMs, along with faster Azure SQL

Microsoft has introduced SSD storage for Azure VMs. This is a catch-up with Amazon, which has been offering SSD storage at least since June 2014. It is an important feature though, and is now in preview. The SSDs are part of the Azure storage service but can only be used for disks attached to VMs, not for general-purpose blob storage. There are three virtual disk types available:

                P10        P20        P30
    Disk size   128GB      512GB      1TB
    IOPS        500        2300       5000
    Throughput  100 MB/s   150 MB/s   200 MB/s

Price is $6.90 per 100GB per month which, if I am reading this right, is less than Amazon’s $0.10 per GB per month ($10 per 100GB) as shown here.
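For the curious, here is a sketch of how the preview pieces fit together in PowerShell; the account name, region and sizes are illustrative, and the details may well change before general availability:

    # Premium (SSD) storage requires its own account type
    New-AzureStorageAccount -StorageAccountName "mypremiumstore" -Location "West US" -Type "Premium_LRS"

    # Attach a new 128GB (P10) data disk, stored in that account, to a DS-series VM
    Get-AzureVM -ServiceName "myservice" -Name "myvm" |
        Add-AzureDataDisk -CreateNew -DiskSizeInGB 128 -DiskLabel "data" -LUN 0 `
            -MediaLocation "https://mypremiumstore.blob.core.windows.net/vhds/data.vhd" |
        Update-AzureVM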

One obvious use case is for SQL Server running on a VM. This generally performs better than Microsoft’s Azure SQL database service. That said, Microsoft is also previewing an improved Azure SQL which supports most of the features of SQL Server 2014, including .NET stored procedures and in-memory columnstore queries. Microsoft’s Scott Guthrie says performance is better:

Our internal benchmark tests (using over 600 million rows of data) show query performance improvements of around 5x with today’s preview relative to our existing Premium Tier SQL Database offering and up to 100x performance improvements when using the new In-memory columnstore technology.

If you can make it work, Azure SQL makes better sense than running SQL Server in a VM, with all the hassles of server patching and of course Microsoft’s licensing fees; but the performance has to be there. Another factor which drives users to the VM option is that SQL Server Reporting Services is not available in Azure SQL.

Windows Phone wobbles: why users are losing heart

When Microsoft acquired Nokia in April this year, there was always a risk that the Windows Phone platform would lose momentum (yes there was some momentum).

Nokia was better at marketing, better at hardware innovation, and better at the all-important operator relations than Microsoft itself.

I consider the launch of Windows Phone 7 in October 2010 to be one of Microsoft’s great disasters, not because of the operating system, which is very good, but because the company failed to get all the pieces in place at the right time. When the phone was first released in the UK, you could not buy it at all in my local town centre, and even if you could find it, the hardware was indifferent: just slightly tweaked Android handsets from the likes of HTC and Samsung.

The underlying problem was that Microsoft was late to market and the iOS/Android duopoly was already dominant; but even so, the company could have done better.

Nokia’s adoption of Windows Phone in February 2011 (first devices came in October 2011) made a striking difference. Distinctive hardware and better visibility in the high street gave the platform a better chance of success. Nokia Drive for turn-by-turn navigation was a great feature, along with other Nokia apps and services like Mix Radio. Admittedly the Lumia 800 (the launch model) had some issues, especially with battery life and charging problems, but better devices followed.

Nokia also started making cheaper Windows Phones, delivering some of the best value smartphones on any platform.

The Nokia Lumia 1020, released in late summer 2013, brought the best camera in any smartphone to market, thanks to the company’s PureView research along with a high-quality though slightly protruding lens.

[photo: Nokia Lumia 1020]

That was something of a high point for Windows Phone. Nokia, perhaps, started to panic as Windows Phone sales still failed to take off as quickly as had been hoped. At Mobile World Congress in February 2014, it announced Nokia X, a version of Android without Google services. This made no sense to me at the time, but it indicated that Nokia had diverted its focus away from improving Windows Phone to chase an alternative (and doomed) platform.

There was also a period of hiatus between September 2013, when Microsoft’s acquisition was announced, and April 2014, when it completed. During this time Nokia operated independently, but with the knowledge that the businesses would be merged in due course; not a good scenario for long-term planning.

The problem today is that even those few who have adopted Windows Phone are losing heart. This is not only to do with market dynamics and the app problems over which Microsoft has no control. Check out this monster thread on Microsoft’s forums. There is a hardware issue with some Lumia models (including the 1020) such that the 8.1 update causes the phone to freeze at random intervals; not good if, for example, you have an alarm set or are waiting for a call. What you will see is that users started complaining on September 10th. Nobody from Microsoft bothered to comment on the thread or help users with mitigation suggestions until November 22nd, when Kevin Lee at Microsoft made an appearance:

Beginning in early September we started to receive an increased number of customer feedback regarding Microsoft Lumia 1020 and 925 device freezes. During the last two months we have been reaching out for more and more data and devices to systematically reproduce and narrow down the root cause. It turned out to be a power regulator logic failure where in combination with multiple reasons the device fails to power up the CPU and peripherals after idling into a deep sleep state.

I am pleased to pass on that we have a fix candidate under validation which we expect to push out the soon with the next SW update!

Which update? When? Mr Lee has made no further comment, and phones are still freezing. It is frustrating for users who return phones for repair, have the software reset, and then still suffer the problem, because it is incorrectly diagnosed by the repair engineers (read the thread for many such tales).

While the specific issue affects only a subset of Windows Phone users, it is indicative not only of poor quality control before the 8.1 release was pushed out, but also of poor communication with users of high-end Windows Phone devices: the market where Microsoft is weakest.

More seriously, the 1020, now coming up to 18 months old, is still in some ways top of the range, certainly for the camera; note that in the Lumia range you have 8x, 9x, and 10x prefixes, and there has been no advance on the 1020 in the 10x series.

Another issue for Windows Phone is that Microsoft is putting out Office for iOS and Android while seeming to neglect its own platform. The forthcoming Visual Studio 2015 includes a new set of tools for both native and HTML-based development for iOS and Android. It is beginning to look as if Microsoft itself is now treating the platform as second-class.

Unlike Ed Bott and Tom Warren I still use a 1020 as my main phone. I like the platform and I like not taking a separate camera with me. It was great for taking snaps on holiday in Norway. But I cannot survive professionally with just Windows Phone. It seems now that a majority of gadgets I review come with a supporting app … for iOS or Android.

Microsoft is capable of making sense of Windows Phone, particularly in business, where it can integrate with Office 365, Active Directory and Azure Active Directory. On the consumer side more could be done to tie in with Windows and Xbox. Microsoft is a software company and could make some great first-party apps for the platform (where are they?).

The signs today though are not good. Since the acquisition we have had some mid-range device launches but little to excite. The sense now is that we are waiting for Windows 10 and Universal Apps (single projects that target both phone and full Windows) to bring it together. Windows 10, though, is not due until the second half of 2015: a long time to wait. If Windows Phone market share diminishes between now and then, there may not be much left to revive.

Google’s official Android Studio is at version 1.0

Google has released version 1.0 of Android Studio.


This Java/Android IDE has been in preview/beta since Google I/O in May. It is based on the excellent JetBrains IntelliJ IDEA.

You can get Android Studio here. It is now the official Android IDE and developers using Eclipse are encouraged to migrate – like it or not.

One of the key features is a new build system based on Gradle. Another notable feature is a visual layout designer; you can toggle between visual and text modes.


Presumably one reason for Google developing its own Android IDE is to integrate more tightly with its cloud services. There is a Google Cloud Module on offer in the IDE.


Android development has its hassles. I seem to spend far too much time in the Android SDK Manager downloading new versions of the SDK, which is frequently updated.


Another annoyance is that the Intel Emulator Accelerator (HAXM) is incompatible with Hyper-V, the official Windows hypervisor. You either have to uninstall Hyper-V, or put up with a slow emulator. I would prefer it if Google/Intel/JetBrains used the standard Windows component.
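A common workaround, short of uninstalling Hyper-V outright, is to switch the hypervisor off at boot from an elevated PowerShell prompt, rebooting after each change:

    # Disable the hypervisor so that HAXM can load
    bcdedit /set hypervisorlaunchtype off

    # Turn it back on when Hyper-V is needed again
    bcdedit /set hypervisorlaunchtype auto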

What is .NET Core, “the foundation of all future .NET platforms”?

I have been looking at .NET Core, an official Microsoft open source project which you can find on github and which is at the heart of Microsoft’s plans to open source most of its .NET technology.

Currently there are three Microsoft repositories for the .NET Core platform: the .NET Compiler Platform (“Roslyn”), ASP.NET 5, and the .NET Core Framework. Note that these are all v.Next versions of the .NET Framework. ASP.NET 5 and the .NET Core Framework are on github, but Roslyn is on CodePlex, Microsoft’s open source repository site. There is also a github repository for Entity Framework 7, currently part of ASP.NET though I am not sure that it belongs there. The current version of EF is 6.1.1, but the code for this is on CodePlex. The KRuntime, which is the implementation of the parts of the .NET Runtime needed to host an ASP.NET application, is also in the ASP.NET repository. Its full name is the K Runtime Environment (KRE); I am not sure what K stands for. Note that Microsoft has only promised to open source the .NET server stack, not desktop frameworks like Windows Presentation Foundation.

I had a look at the .NET Core Framework. This is the key set of libraries for .NET applications. The easiest way to build the core libraries is from the command line. Open a Visual Studio 2013 Developer Command Prompt (which sets up the path and environment for command line builds), go to your clone of the github repository and type build.
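Assuming the repository layout stays as it is today, the steps from a Visual Studio 2013 Developer Command Prompt are along these lines:

    git clone https://github.com/dotnet/corefx.git
    cd corefx
    build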


Cool. But what is in it? Not that much: System.Collections, Parallel LINQ, Vectors and XML libraries.

“More is coming soon. Stay tuned!” say the docs. And in this blog post by Microsoft’s Immo Landwerth:

Consider the subset we have today a down-payment on what is to come. Our goal is to open source the entire .NET Core library stack by Build 2015.

Landwerth says that Microsoft is “currently figuring out the plan for open sourcing the runtime”; this is the native code that creates the .NET Virtual Machine which executes .NET code.

Of course there is also Mono, the longstanding open source implementation of .NET, which comes from an independent code base.

This is exciting stuff for .NET developers, especially since official runtimes for Linux and Mac are also promised, but also somewhat confusing. What is .NET Core versus what we have known as the .NET Framework?

Here is a diagram from Landwerth’s blog:

[diagram from Landwerth’s blog: the .NET Framework, .NET Core and ASP.NET 5]

I presume that the top left box (.NET Framework) has not been promised as open source, but the other two boxes have. Note that ASP.NET 5 will run on either .NET Core or the full .NET Framework; and that .NET Native – the project to compile a .NET application as true native code – sits as part of .NET Core.

Store apps (also known as Windows Runtime apps, or Metro apps) are not covered in the above diagram, but since .NET Native currently only works for Store apps, maybe .NET Core is also the .NET runtime for Store apps. Landwerth says:

.NET Core is a modular development stack that is the foundation of all future .NET platforms. It’s already used by ASP.NET 5 and .NET Native.

There are also some clues about .NET Core in the home page for the github repository:

.NET Core and the .NET Framework have (for the most part) a subset-superset relationship. .NET Core is named "Core" since it contains the core features from the .NET Framework, for both the runtime and framework libraries. For example, .NET Core and the .NET Framework share the GC, the JIT and types such as String and List<T>. We’ll continue improving these components for both .NET Core and .NET Framework.

.NET Core was created so that .NET could be open source, cross platform and be used in more resource-constrained environments. We have also published a subset of the .NET Reference Source under the MIT license, so that you and the community can port additional .NET Framework features to .NET Core.

The second paragraph is intriguing. Microsoft has posted parts of the source for the .NET Framework library so that the community can port some of it to .NET Core. What this means I think is not that this code should be part of .NET Core (otherwise it becomes more than just core) but rather that it would run on .NET Core.

It seems, contrary to what you might have thought, that the full .NET Framework is not a superset of .NET Core, although it is intended to be close to that. This has interesting implications for future compatibility. If .NET Core is intended to be more agile and to evolve more rapidly than the .NET Framework, being somewhat free of backwards compatibility constraints, we will soon find that there are features in .NET Core that do not exist in the .NET Framework, as well as vice versa; in other words, two incompatible stacks. That could be a problem.

Despite Microsoft’s impressive openness in publishing much of its .NET work and forming the .NET Foundation, I for one would appreciate a clearer presentation of the plans for .NET Core and the .NET Framework, and the extent to which the .NET Framework should now be considered a legacy or Windows desktop-only technology. I suspect the answer for the moment is “wait for Build.”

On the Register in November: Windows desktop development woes, inside Amazon Aurora, more

As I mentioned last month, I have been working part-time for the Register since the beginning of October. Here’s what I wrote last month:

Google Chrome on Windows ‘completely unusable’, gripe users

This one seemed to strike a chord: is Google neglecting its Windows users?

Pity the poor Windows developer: The tools for desktop development are in disarray

And this. The price of the Windows 8 adventure was lost momentum on the desktop side. If Universal Apps come good, it might yet have been worth it.

Inside Aurora: how disruptive is Amazon’s MySQL clone?

A more detailed look at what makes Amazon Aurora tick.

The cloud that goes puff: Seagate Central home NAS woes

Not just about the need for backup, but a note on how hard it is to read the drive outside its enclosure, even when it is not faulty.

Microsoft adds video offering to Office 365. Oh NOES, you’ll need Adobe Flash

That dreaded syncing feeling: Will Microsoft EVER fix OneDrive?

The weak spot in Microsoft’s cloud story.

Amazon, Docker hop in bed- What happens in Vegas WON’T stay in Vegas

Microsoft .NET released from its Windows chains… but what ABOUT MONO?

Interview with Miguel de Icaza on Microsoft’s big open source news. Check the forum at the .NET Foundation for more on this.

Boxing clever: Amazon Fire TV is SO CLOSE to being excellent

Review of Amazon’s video streamer and budget games box.

All but full-fat MS Office to be had on iPads, Droidenslabben for NOWT

Microsoft improves Azure SQL Server cloud service, simultaneously makes it worse

Improving JavaScript- Google throws AtScript into the mix

Quick reflections on Amazon re:Invent, open source, and Amazon Web Services

Last week I was in Las Vegas for my first visit to Amazon’s annual developer conference re:Invent. There were several announcements, the biggest being a new relational database service called RDS Aurora – a drop-in replacement for MySQL but with 3x write performance and 5x read performance as well as resiliency benefits – and EC2 Container Service, for deploying and managing Docker app containers. There is also AWS Lambda, a service which runs code in response to events.

You could read this news anywhere, but the advantage of being in Vegas was to immerse myself in the AWS culture and get to know the company better. Amazon is both distinctive and disruptive, and three things that its retail operation and its web services have in common are large scale, commodity pricing, and customer focus.

Customer focus? Every company I have ever spoken to says it is customer focused, so what is different? Well, part of the press training at Amazon seems to be that when you ask about its future plans, the invariable answer is “what customers demand.” No doubt if you could eavesdrop at an Amazon executive meeting you would find that this is not entirely true, that there are matters of strategy and profitability which come into play, but this is the story the company wants us to hear. It also chimes with that of the retail operation, where customer service is generally excellent; the company would rather risk giving a refund or replacement to an undeserving customer and annoy its suppliers than vice versa. In the context of AWS this means something a bit different, but it does seem to me part of the company culture. “If enough customers keep asking for something, it’s very likely that we will respond to that,” marketing executive Paul Duffy told me.

That said, I would not describe Amazon as an especially open company, which is one reason I was glad to attend re:Invent. I was intrigued, for example, that Aurora is a drop-in replacement for an open source product, and wondered if it actually uses any of the MySQL code. That seems unlikely, since MySQL’s GPL license would require Amazon to publish its own code if it did; then again, the InnoDB storage engine code at least used to be available under a dual license, so it is possible. When I asked Duffy, though, he said:

We don’t … at that level, that’s why we say it is compatible with MySQL. If you run the MySQL compatibility tool that will all check out. We don’t disclose anything about the inner workings of the service.

This of course touches on the issue of whether Amazon takes more from the open source community than it gives back.

[photo: Senior VP of AWS Andy Jassy]

Someone asked Senior VP of AWS Andy Jassy, “what is your strategy of contributing to the open source ecosystem”, to which he replied:

We contribute to the open source ecosystem for many years. Xen, MySQL space, Linux space, we’re very active contributors, and will continue to do so in future.

That was it, that was the whole answer. Aurora, despite Duffy’s reticence, seems to be a completely new implementation of the MySQL API and builds on its success and popularity; could Amazon do more to share some of its breakthroughs with the open source community from which MySQL came? I think that is arguable; but Amazon is hard to hate since it tends to price so competitively.

Is Amazon worried about competition from Microsoft, Google, IBM or other cloud providers? I heard this question asked on several occasions, and the answer was generally along the lines that AWS is too busy to think about it. Again this is perhaps not the whole story, but it is true that AWS is growing fast and dominates the market to the extent that, say, Azure’s growth does not keep it awake at night. That said, you cannot accuse Amazon of complacency since it is adding new services and features at a high rate; 449 so far in 2014 according to VP and Distinguished Engineer James Hamilton, who also mentioned 99% usage growth in EC2 year on year, over 1,000,000 active customers, and 132% data transfer growth in the S3 storage service.

Cloud thinking

Hamilton’s session on AWS Innovation at Scale was among the most compelling of those I attended. His theme was that cloud computing is not just a bunch of hosted servers and services, but a new model of computing that enables new and better ways to run applications that are fast, resilient and scalable. Aurora is actually an example of this. Amazon has separated the storage engine from the relational engine, he explained, so that only deltas (the bits that have changed) are passed down for storage. The data is replicated 6 times across three Amazon availability zones, making it exceptionally resilient. You could not implement Aurora on-premises; only a cloud provider with huge scale can do it, according to Hamilton.

[photo: Distinguished Engineer James Hamilton]

Hamilton was fascinating on the subject of networking gear – the cards, switches and routers that push bits across the network. Five years ago Amazon decided to build its own, partly because it considered the commercial products to be too expensive. Amazon developed its own custom network protocol stack. It worked out a lot cheaper, he said, since “even the support contract for networking gear was running into 10s of millions of dollars.” The company also found that reliability increased. Why was that? Hamilton quipped about how enterprise networking products evolve:

Enterprise customers give lots of complicated requirements to networking equipment producers who aggregate all these complicated requirements into 10s of billions of lines of code that can’t be maintained and that’s what gets delivered.

Amazon knew its own requirements and built for those alone. “Our gear is more reliable because we took on an easier problem,” he said.

AWS is also in a great position to analyse performance. It runs so much kit that it can see patterns of failure and where the bottlenecks lie. “We love metrics,” he said. There is an analogy with the way the popularity of Google search improves Google search; it is a virtuous circle that is hard for competitors to replicate.

Closing reflections

Like all vendor-specific conferences there was more marketing than I would have liked at re:Invent, but there is no doubting the excellence of the platform and its power to disrupt. There are aspects of public cloud that remain unsettling; things can go wrong and there will be nothing you can do but wait for them to be fixed. The benefits though are so great that it is worth the risk – though I would always advocate having some sort of plan B, off-cloud or with another cloud provider, if that is feasible.

Microsoft’s Azure outage: a troubling account of what went wrong

Microsoft’s Jason Zander has published an account of what went wrong yesterday, causing failure of many Azure services for a number of hours. The incident is described as running from 0:51 AM to 11:45 AM on November 19th, though the actual length of the outage varied; an Azure application which I developed was offline for 3.5 hours.

Customers are not happy. From the comments:

So much for traffic manager for our VM’s running SQL server in a high availability SQL cluster $6k per month if every data center goes down. We were off for 3 hrs during the worst time of day for us; invoicing and loading for 10,000 deliveries. CEO is wanting to pull us out of the cloud.

So what went wrong? It was a bug in an update to the Storage Service, which impacts other services such as VMs and web sites, since they have a dependency on it. The update was already in production, but only for Azure Tables; this seems to have given the team the confidence to deploy it generally, but a bug in the Blob service caused it to loop and stop responding.

Here is the most troubling line in Zander’s report:

Unfortunately the issue was wide spread, since the update was made across most regions in a short period of time due to operational error, instead of following the standard protocol of applying production changes in incremental batches.

In other words, this was not just a programming error; it was an operational error that bypassed the usual safeguard of rolling out changes in incremental batches, so that the faulty update hit most regions at once.

Then there is the issue of communication. This is critical since while customers understand that sometimes things go wrong, they feel happier if they know what is going on. It is partly human nature, and partly a matter of knowing what mitigating action you need to take.

In this case Azure’s Service Health Dashboard failed:

There was an Azure infrastructure issue that impacted our ability to provide timely updates via the Service Health Dashboard. As a mitigation, we leveraged Twitter and other social media forums.

This is an issue I see often; online status dashboards are great for telling you all is well, but when something goes wrong they are the first thing to fall over, or else fail to report the problem. In consequence users all pick up the phone simultaneously and cannot get through. Twitter is no substitute; frankly if my business were paying thousands every month to Microsoft for Azure services I would find it laughable to be referred to Twitter in the event of a major service interruption.

Zander also says that customers were unable to create support cases. Hmm, it does seem to me that Microsoft should isolate its support services from its production services in some way so that both do not fail at once.

Still, of the above it is the operational error that is of most concern.

What are the wider implications? There are two takes on this. One is to say that since Azure is not reliable, try another public cloud, probably Amazon Web Services. My sense is that the number and severity of AWS outages have reduced over the years. Inherently though, it is always possible for human error or a hardware failure to have a cascading effect; there is no guarantee that AWS will not have its own equally severe outage in future.

The other take is to give up on the cloud, or at least build in a plan B in the event of failure. Hybrid cloud has its merits in this respect.

My view in general though is that cloud reliability will get better and that its benefits exceed the risk – though when I heard last week, at Amazon re:Invent, of large companies moving their entire datacenter infrastructure to AWS I did think to myself, “that’s brave”.

Finally, for the most critical services it does make sense to spread them across multiple public clouds (if you cannot fallback to on-premises). It should not be necessary, but it is.

Microsoft promises to fix OneDrive sync in Windows 10, with one engine for Business and Consumer

Microsoft’s Jason Moore has responded to feedback on the change to OneDrive sync in the latest Windows 10 preview. The change removed the “placeholder” feature, where OneDrive files and metadata all show up in Windows Explorer, but do not actually download until requested. It was not a popular move among Windows power users, as reported here.

It turns out there is more going on here than merely tweaking a feature. In his response, Moore states:

We stepped back to take a fresh look at OneDrive in Windows. The changes we made are significant. We didn’t just “turn off” placeholders – we’re making fundamental improvements to how Sync works, focusing on reliability in all scenarios, bringing together OneDrive and OneDrive for Business in one sync engine, and making sure we have a model that can scale to unlimited storage. In Windows 10, that means we’ll use selective sync instead of placeholders. But we’re adding additional capabilities, so the experience you get in Windows 10 build 9879 is just the beginning. For instance, you’ll be able to search all of your files on OneDrive – even those that aren’t sync’ed to your PC – and access those files directly from the search results. And we’ll solve for the scenario of having a large photo collection in the cloud but limited disk space on your PC.

This is good news since it goes to the heart of a more serious issue: the poor implementation of OneDrive sync in Windows, especially in the “Business” edition which has a sync engine based on Office Groove. The consumer OneDrive sync is not perfect either, with a tendency to create duplicate files if you use more than one PC. There is also some kind of bug which means you can edit a file, save it, email it as an attachment, and find that you actually emailed an old version (this has happened to me when submitting articles to editors; no fun).

I have written more on OneDrive issues and confusions here. The poor sync experience with OneDrive for Business is perhaps the weakest point in Office 365 currently; a significant problem.

Now we will get a single sync engine across both versions of OneDrive. If it is also a better sync engine than either of the current ones, Microsoft’s cloud customers will be delighted.

Moore adds: “Longer term, we’ll continue to improve the experience of OneDrive in Windows File Explorer, including bringing back key features of placeholders.”

Questions remain of course. Will Microsoft unify the server technology as well as the sync engines? Will the new sync engine come to Windows 7 and 8 as well as 10? Will the company fix the mobile apps as well? Will OneDrive ever approach the fast, seamless sync achieved by Dropbox?

Watch this space.