When is AI useful in the real world?

TL;DR: AI is useful for a real-world task only if the cost of having the AI do the task, plus the cost of you checking its work, is less than the cost of the existing solution.


AI is quickly getting better at many things. This excites many people, and justly so.

But an AI’s work doesn’t get shipped unless it’s checked, assuming you don’t want to go out of business.[1] So it’s important to remember in this age of AI hype that:

The time it takes the AI to do the task, plus the time it takes you to check its work, has to be less than the time the existing solution takes. Otherwise the AI isn’t useful.

People seem inclined to forget this when prognosticating about AI. But you have to do the full cost accounting for how the AI will get used in order to assess the impact.
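As a rough sketch of that accounting, the rule can be written as a single comparison. The function name and the minute figures are illustrative assumptions, not measurements: the letter numbers reuse the rough estimates from the case studies in this post, and the back-end numbers are made up to stand in for "ages".

```python
def ai_is_useful(ai_minutes: float, check_minutes: float, existing_minutes: float) -> bool:
    """True when AI time plus checking time beats the existing solution."""
    return ai_minutes + check_minutes < existing_minutes


# Legal-letter case: 20 min writing + 1 min proofreading today,
# versus an instant draft + 1 min proofreading with the AI.
letter = ai_is_useful(ai_minutes=0, check_minutes=1, existing_minutes=20 + 1)

# Hypothetical back-end case: 300 min stands in for "ages".
# Today: 300 min writing + 300 min checking. With AI: an instant draft,
# then "ages x 2" (600 min) to read, understand, correct, and test it.
backend = ai_is_useful(ai_minutes=0, check_minutes=2 * 300, existing_minutes=300 + 300)

print(letter, backend)  # True False
```

Note that the comparison is strict: an AI that merely ties the existing solution is not a win.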

Of course, if the task your AI is doing is something humans can’t do at all, then no matter how long the AI takes, it might be worth it. And if your AI is really good, then you may only have to check its work once, or occasionally. But both these situations are really quite rare in modern AI applications.[2] Definitely rarer than much of the media lets on.
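To make the "check once, or occasionally" case concrete, here is a minimal sketch that spreads the checking cost over many runs. It assumes, purely for illustration, that one check covers a batch of `runs_per_check` similar tasks. The function name and the numbers are hypothetical.

```python
def cost_per_run(ai_minutes: float, check_minutes: float, runs_per_check: int) -> float:
    """Average minutes per task when one check covers `runs_per_check` runs."""
    if runs_per_check < 1:
        raise ValueError("runs_per_check must be at least 1")
    return ai_minutes + check_minutes / runs_per_check


# Checking every output: the full 30-minute check lands on each run.
print(cost_per_run(ai_minutes=0, check_minutes=30, runs_per_check=1))    # 30.0
# Checking one output in a hundred: the same check costs 0.3 minutes per run.
print(cost_per_run(ai_minutes=0, check_minutes=30, runs_per_check=100))  # 0.3
```

The less often a check is needed, the smaller its share of each run, which is why an expensive check can still be affordable for a highly repetitive task.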

This is why AI makes better minions than prophets, in general. A minion repeatedly performs a task very similar to the one it was validated on. A prophet tells you something you don’t know, can’t check, and maybe will never be able to check. It’s almost always easier to check the former’s work than the latter’s.

Also, it goes without saying, this is not the only requirement for AI to be useful in the real world. It’s just the one I see people miss most, and the one I’m ranting about here.

Case studies

Good example: code generation for a front-end component (as of 2020)

Existing solution:

  1. Non-expert human writes code (potentially ages)
  2. Non-expert human checks that rendered component works (2 mins)

AI solution:

  1. AI generates code (instant)
  2. Non-expert human checks that rendered component works (2 mins)

Bad example: code generation for back-end code (as of 2020)

Existing solution:

  1. Non-expert human writes code (potentially ages)
  2. Non-expert human checks code correctness (also potentially ages)

AI solution:

  1. AI generates code (instant)
  2. Non-expert human reads, fully understands, corrects, and tests code, or hires someone to do all this (ages x 2)

Good example: generating a legal letter

Existing solution:

  1. Human writes a letter, checking it while they write (20 mins, or way more if they’ve never written a legal letter)
  2. Human proofreads letter (1 min)

AI solution:

  1. AI generates letter (instant)
  2. Human proofreads letter to make sure it isn’t crazy (1 min)

Good example: generate stock media, as done by Rosebud.ai

Existing solution:

  1. Human goes out and produces a photoshoot/videoshoot, etc. (hours to days) or buys stock media (not cheap)
  2. Human verifies that media looks okay (2 mins)

AI solution:

  1. AI generates media (instant)
  2. Human verifies that media looks okay (2 mins)

Questionable example: checkout-free stores

It’s not clear how much “AI” Amazon Go and its various competitors use. And to the extent that they do, it’s not clear how accurate they’re going to be able to make it. There’s a lot of innovation involved here that’ll be interesting to watch.

But bottom line: they’re going to have to spend some money to pay human-in-the-loop operators at the beginning, plus the usual human duties of cleaning and stocking the store. So we’ll see whether all the factors net out to a cost lower than that of just having employees in the store do the usual job.


  1. Note: this doesn’t include the case where the goal of your AI is to make you sound good to investors. For that I recommend going to arXiv and assembling an arsenal of buzzwords from paper abstracts.

  2. Note: the “real world” AI applications I’m talking about don’t include the case where your AI application is just selling direct access to an AI model, like AWS Rekognition or GPT-3. These are great business models for now, but not what I’m talking about.