Tag Archives: ai

Too hard for AI? Get me a picture of people playing bridge …

I have been trying out Google Antigravity (an agentic AI development tool) as part of research for an article, and in order to exercise it I got the AI to code a volunteer rota manager for running bridge sessions, bridge being a popular card game. The LLM (large language model) was Gemini 3.1 Pro.

The tool created a Next.js application, with functionality in areas such as an admin dashboard, but the home page was the default one for a Next.js app created with create-next-app. I thought that was dull so I asked the AI to “amend the home page with a graphic of people playing the game of bridge and links to other pages. If the user is not logged in it should just show a login link. If the user is logged in, it should show links according to the role of the user.”

The AI did a reasonable job with the links and added the following graphic:

People playing bridge: attempt 1

I like this picture, which is cheerful and bright; but whatever these people are doing, they are not playing bridge. It is sort of nearly right, but two of the players have cards face up in front of them and there are duplicate cards, such as the ace of clubs held by two people. It is also obvious that not every player has 13 cards, as they would in a game of bridge. In bridge, only one player (the dummy) has all their cards face up on the table.

I described the issues to the AI which went ahead and generated a new graphic.

People playing bridge: attempt 2

Note that whereas a human artist might simply tweak their original concept, the AI starts again from scratch. It did not improve on its first attempt. There seem to be two players who think they are dummy, there are four instances of the ace of diamonds, and the number of cards is wrong. One hardly likes to mention that there are only two bidding boxes (there should be four), or the curious speech balloon above one of the players. Still, it is another cheery picture.

I prompted the AI that the graphic was still wrong, saying that “There are duplicate cards; there are no duplicate cards in a pack of 52. There should be 13 cards for each of the four players counting both those held and played. Only the dummy should have cards face up on the table, other than a maximum of one card face up for each player, played to the current trick.”

The AI came back with a third attempt. Rather than removing duplicate cards, it reported that:

“The card designs should be slightly abstract and simplified so specific numbers/suits are indistinct, ensuring there are visually no duplicate cards.”

People playing bridge: attempt 3

The image is better in some ways, worse in others. It is less cheerful and the background is plainer. The notion of blurring the cards has not worked; in fact all the cards in dummy, and those held by the player opposite, seem to be aces. The bit about 13 cards each has not been implemented. The player in the blue top is holding his cards in an impossible manner; the cards in the middle would fall to the table.

Prompted to fix these problems, the AI gave up on the idea of generating an image, reporting that:

“Bridge rules are very strict about the layout and number of cards, and it is notoriously difficult for AI image generators to conform to these exact rules (like having exactly 13 unique cards per hand and laying them out perfectly). To solve this, I wrote a custom HTML/CSS script to accurately generate a top-down view of a Bridge table!”

I thought that sounded complex but asked the AI to go ahead.

People playing bridge: attempt 4

Oh dear. This is possibly the worst effort yet. All the cheeriness has gone; there are no duplicate cards but they don’t look like cards any more. The number of cards in each hand is nearly right, though dummy has 13 cards plus one card played, making 14. On the bright side, the basic layout is pretty accurate. One more try …

“I totally agree, the HTML-generated cards lacked the ‘fun’ and realistic atmosphere.

I have generated a third and final try at the graphic. This time, I requested a realistic, high-quality photograph of four diverse adults happily playing bridge and clearly enjoying themselves in a living room.

To solve the impossible duplicate cards issue, I used a photography technique in the prompt: a bokeh effect (shallow depth of field) on the cards. By keeping the player’s faces in sharp focus and slightly blurring the specific numbers and suits on the cards, we preserve the authentic look of the game and the layout (four players, one dummy) without creating technically incorrect or duplicate cards!”

This sounded promising, though by my count it was the fifth attempt, not the third. Here is what came back:

People playing bridge: attempt 5

The smiles are back but the AI has got the layout wrong again. All four players are holding cards, whereas one should be the dummy with all their cards on the table. And the dummy cards are never laid out like that.

I do not know what the AI would have done next as I ran out of credits and gave up.

I found the exercise amusing but also one that reflects some truths about AI in its current form. The graphics are amazing in some ways; four of the five look good. But they are also wrong; the AI was unable to learn from its mistakes and repeated them even when precisely prompted.

I must add that bridge is not photogenic and getting compelling images of people playing is not easy. Here is an actual photo, courtesy of the English Bridge Union:

People actually playing bridge

Microsoft Build: Azure-powered Drones, another go with Kinect, and other key announcements

Microsoft Build is kicking off today in Seattle, and the company has made a ton of announcements.

See here for some background on Build and what is happening with Microsoft’s platform.

The most eye-catching announcement is a partnership with drone manufacturer DJI, which says it will make Azure its preferred cloud provider. Microsoft has also announced an SDK. There is much obvious value in drones from a business perspective, for example examining pipes for damage. Sectors such as construction, agriculture and public safety are obvious candidates.


Microsoft’s Kinect sensor was originally launched as a gaming accessory for Xbox 360 and then Xbox One. It has been a flop in gaming, but the technology has plenty of potential. Coming in 2019 is Project Kinect for Azure, a new device with upgraded sensors for connecting “AI to the edge”, in Microsoft’s words. More here.


The Azure IoT Edge runtime is going open source. More cognitive services will now run directly on the runtime, in other words without depending on internet connectivity, including Custom Vision for image recognition (handy for drones, perhaps). A partnership with Qualcomm will support camera-powered cognitive services.

AI for Accessibility is a new initiative to use AI to empower people via assistive technology, building on previous work such as the use of Cognitive Services to help a visually impaired person “see” the world around them.

Project Brainwave is a new project to accelerate AI by running calculations on an FPGA (Field Programmable Gate Array) in partnership with Intel.

On the Windows front, a new application called Microsoft Layout uses Mixed Reality to let customers design spaces in context, using 3D models.

Windows Timeline, new in the April 2018 Windows 10 update, is coming to iOS and Android. On Android it is a separate application, while on iOS it is incorporated into the Edge browser.

Amazon Alexa and Microsoft Cortana are getting integration (in limited preview) such that you can call up Cortana using an Amazon Echo, or summon Alexa within Cortana on Windows.


There is more to come, including AI updates to Visual Studio (not IntelliSense but IntelliCode), Visual Studio Live Share collaboration in preview, and a partnership with GitHub to integrate with App Center (DevOps for apps for mobile devices).

And big .NET news at Build: .NET Core 3.0 in 2019 will run Windows desktop applications, via frameworks including Windows Forms, Windows Presentation Foundation (WPF), and UWP XAML.

What is ML? What is AI? Why does it matter?

ML (Machine Learning) and AI (Artificial Intelligence) are all the rage and changing the world, but what are they?

I was asked this recently which made me realise that it is not obvious. So I will have a go at a quick explanation, based on my own perception of what is going on. I am not a data scientist so this will be a high-level take.

I will start with ML. The ingredients of ML are:

1. Data

2. Pattern recognition algorithms

Imagine that you want to identify pictures that contain images of people. The data is lots of images. What you want is an algorithm that automatically detects which images contain people. Rather than trying to code this on your own, you give the ML system a quantity of images that do contain people, and a quantity of images that do not. This is the training process. Once trained, the ML system will predict, when shown an image, whether or not it contains people. If your training has been successful, it will have a high success rate.

The combination of the algorithm and the parameters it learns during training is called a model. There are many types of model and a number of different ML systems, from open source (eg TensorFlow) to big brands like Amazon Machine Learning, Azure Machine Learning, and Google Machine Learning.
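To make the train-then-predict cycle concrete, here is a toy sketch in plain Python. It uses a nearest-centroid classifier as a stand-in for a real image model, and made-up two-number feature vectors in place of actual images; all names and data are invented for illustration, not taken from any of the ML systems mentioned above.

```python
# Toy illustration of the ML cycle: train on labelled examples,
# then predict a label for new input. Real image models learn from
# pixels; here each "image" is just a pair of made-up feature numbers.

def train(examples):
    """Learn a model: the average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        s = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            s[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in s] for label, s in sums.items()}

def predict(model, features):
    """Predict the label whose centroid is nearest to the input."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

# Training data: (features, label) pairs standing in for labelled images.
training = [
    ([0.9, 0.8], "people"), ([0.8, 0.9], "people"),
    ([0.1, 0.2], "no_people"), ([0.2, 0.1], "no_people"),
]
model = train(training)
print(predict(model, [0.85, 0.9]))  # a "people"-like input -> people
```

The shape of the code mirrors the description above: training consumes labelled examples and produces a model; prediction applies the model to unseen input. A real system differs only in scale and in the sophistication of the algorithm.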

So what is AI? This is a more generic term, so we can say that ML is a form of AI. IBM describes its Watson service as AI – Watson is really a bunch of different services so that makes sense.

A quick way to think of AI is that it answers questions. Is this customer a good credit risk? Is this component good or faulty? Who is the person in this picture?

Another common form of AI is a chatbot or conversational UI. The key task here is artificial language understanding, possibly accompanied by speech-to-text transcription if you want voice input, and then a back-end service that will generate a response to what it thinks the language input means. I coded a simple one myself using Microsoft’s Bot Framework and LUIS (Language Understanding Intelligent Service). My bot just performed searches, so if you wrote or said “tell me about x”, it would do a search for x and return the results. Not much use; but you can see how the concept can work for things like travel bookings and customer service. The best chatbots understand context and remember previous conversations, and when combined with personal information like location and preferences, they can get a lot of things right by conjecture. “Get me a taxi” – “Is that to Charing Cross station as usual?”
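The intent-plus-entity idea behind such a bot can be sketched in a few lines of plain Python. To be clear, this is not the Bot Framework or LUIS API; it is a hypothetical illustration of matching an utterance to an intent, extracting an entity, and using context, with all pattern and function names invented for the example. A real service like LUIS learns intents from example utterances rather than fixed regular expressions.

```python
import re

# Hypothetical intent table: each entry maps a regex to an intent name.
INTENTS = [
    (re.compile(r"tell me about (?P<topic>.+)", re.IGNORECASE), "search"),
    (re.compile(r"get me a taxi", re.IGNORECASE), "book_taxi"),
]

def understand(utterance):
    """Return (intent, entities) for an utterance, or ("none", {})."""
    for pattern, intent in INTENTS:
        match = pattern.search(utterance)
        if match:
            return intent, match.groupdict()
    return "none", {}

def respond(utterance, context=None):
    """Generate a reply, optionally using context such as preferences."""
    intent, entities = understand(utterance)
    if intent == "search":
        return f"Here is what I found about {entities['topic']}."
    if intent == "book_taxi":
        usual = (context or {}).get("usual_destination")
        if usual:
            return f"Is that to {usual} as usual?"
        return "Where would you like to go?"
    return "Sorry, I did not understand that."

print(respond("Tell me about bridge"))
print(respond("Get me a taxi", {"usual_destination": "Charing Cross station"}))
```

The second call shows the context trick from the taxi example: stored personal information lets the bot make a plausible conjecture instead of asking a question from scratch.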

Internet search has morphed into a kind of AI. If you type “What is the time?”, it comes up on the screen.


The more the search engines personalise the search results, the more assumptions they can make. In the example above, Bing has used what it thinks is my location to give me the time where I am.

AI can also take decisions. A self-driving car, like a human driver, takes decisions constantly: whether to stop or go, what speed to travel at, whether to turn this way or that. It uses sensors and pattern recognition, as well as its programmed route, to make those decisions.

Why does AI matter? It feels like early days; but there are obvious commercial applications, now that using ML and AI is within reach of any developer.

Marketers and advertisers are enthusiastic because they love targeting – and consumers often prefer more relevant advertising (though they might prefer less advertising even more). Personalisation is the key, and as mentioned above, ML and AI are good at answering questions. The more data, the more personal the targeting. How much does this person earn? Male or female? Where are they? Single or in a relationship? Do they have children? Even answering these (and many more) questions somewhat inaccurately will greatly increase the ability of marketers to offer the right product or service at the right moment.

Of course there are privacy questions around this. There are other questions too. What about the commercial advantage this gives to those few entities that hold huge volumes of personal data, such as Google and Facebook? What about when showing people “more relevant content” becomes a threat to democracy, because individuals get a distorted view of the world, seen through a tunnel formed by their own preference to avoid competing views? Society is only just beginning to grasp the implications.

Another key area is automation. Amazon made a splash by opening a store where you do not have to check out: object recognition detects what you buy and charges your account automatically. Fewer staff needed, and more convenient for shoppers.

Detecting faulty goods on a production line is another common use. Instead of a human inspecting goods for flaws, AI can identify a high percentage of problems automatically. It may be just a case of recognizing patterns in images, as discussed above.

AI can go wrong though. An example was mentioned at an event I attended recently. I cannot vouch for the truth of the story, but it is kind-of plausible. The task was to help the military detect tanks hidden in trees. They took photos of trees with hidden tanks, and trees without hidden tanks, and used them for machine learning. The results were abysmal. The reason: all the photos which included tanks were taken on an overcast day, and those without tanks on a clear day. So the ML decided that tanks only hide on cloudy days.

ML is prone to this kind of mistake. What similar problems might occur when applied to people? Could ML make inappropriate inferences from characteristics such as beards, certain types of clothing, names, or other things about which we should be neutral? It is quite possible, which is another reason why applications of AI need an ethical framework as well as appropriate regulation. This will not happen smoothly or quickly, and will not be universally implemented, so humanity’s use of ML and AI is something of a social experiment, with potential for both good and bad outcomes.