Ben Taylor
Oct 4, 2018


About Me: I spend every waking hour programming on an AI supercomputing system. I have logged enough cycles now to become an AI expert. I can do in a day what some Fortune 500 teams fail to do in a year.

The AI Genie:

A Genie is a perfect analogy for AI. Today we can control it, we can steer it, and it will do a lot of good. We can use AI to do the things we don’t want to do ourselves. We can automate human processes and free ourselves to focus on the difficult problems instead of the mundane tasks. In the future, I can look forward to my humanoid robot doing the dishes by hand if needed, walking the dog, watching the kids, and helping me frame my basement. Science fiction? I used to think not in my lifetime. Not anymore. Some of the tasks we don’t want to do are also some of the darkest. Humans don’t like being put in harm’s way, and the sooner we can have AI fighting our wars for us, the better. Whether or not the species is better off is to be determined. There are scenarios, too, where our wonderful Genie becomes scary overnight. It happened in the 1992 Disney movie Aladdin, and it could happen to us.

Extinction From Dumb AI:

There is a scenario where humans could become extinct at the hands of inferior, subhuman intelligence. If global alliances spin up massive AI war-machine efforts, an effective, scalable, search-and-destroy robot production line could wipe humans out. Imagine groups A and B investing all resources and mindshare toward world domination or survival. Each side can produce genetically unique droids/drones within 48 hours and scale to millions or even billions of automated fighters. These fighters could use RF tags to identify their own civilians, or even some type of country-of-origin recognition based on your face/physiology. If billions of droids/drones are searching the planet for “others”, and they can self-repair and self-manufacture, then there you go.

The future aliens who visit Earth will wonder why the planet’s robots are so violent. The aliens will send envoys and gifts and attempt to communicate, only to be attacked and killed. They will assume there is something complicated, or calculated, about life on Earth being so aggressive. Nope. These are just effective, stupid killing machines that wiped out their creators.

AI Breakout:

How can you design AI that we can control? If you want to find a bug in your goal-optimization code, run it through a genetic algorithm. I remember building trading algorithms and optimizing them, only to find the simulation thought it had achieved a Sharpe ratio of 50. That is impossible. For people familiar with trading, a Sharpe ratio of 2–3 is great: you have a hedge fund. A Sharpe of 50 means your AI has found a bug in your code. The AI we design will always be focused on maximizing a goal. The problem is that it may do strange things in pursuit of that goal that surprise us.
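
As a minimal sketch of what that looks like in practice (the look-ahead flaw, numbers, and strategy here are my own illustration, not the actual trading code):

```python
# A toy backtest with a deliberate look-ahead bug: any optimizer pointed at
# this will "discover" the bug and report an impossible Sharpe ratio.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, 2520)  # ~10 years of synthetic daily returns

def sharpe(daily_pnl):
    # Annualized Sharpe: mean over std, scaled by sqrt(252 trading days)
    return np.sqrt(252) * daily_pnl.mean() / daily_pnl.std()

# Honest strategy: today's position comes from yesterday's return.
honest_pnl = np.sign(returns[:-1]) * returns[1:]

# Buggy strategy: an off-by-one error lets the signal peek at today's return.
buggy_pnl = np.sign(returns[1:]) * returns[1:]

print(f"honest Sharpe: {sharpe(honest_pnl):.2f}")  # hovers near zero
print(f"buggy Sharpe:  {sharpe(buggy_pnl):.2f}")   # absurdly high -> a bug
```

A search procedure will happily converge on the buggy variant, because nothing in the objective says “don’t peek.”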

To maximize human happiness, it may hook us all up to some type of drug drip where our brains report maximum happiness while we drool our lives away.
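
In code, this wireheading problem is just an argmax over the wrong column; the actions and scores below are invented for illustration:

```python
# Invented toy example: the optimizer only ever sees the metric we wrote
# down, never the intent behind it.
actions = {
    "improve_lives": {"reported_happiness": 7,  "actual_wellbeing": 7},
    "drug_drip":     {"reported_happiness": 10, "actual_wellbeing": 0},
}

# We told it to maximize reported happiness, so that is all it maximizes.
best = max(actions, key=lambda a: actions[a]["reported_happiness"])
print(best)  # "drug_drip" -- maximal on the metric, catastrophic in reality
```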

I may design an AI in the future and give it the objective of making my spouse happy. After weeks of reviewing my behavior and the problem, the AI may intentionally cause a conflict that leads us to divorce, or kill me outright, to satisfy this objective.

Humans can’t relate to the computer’s obsession with goal achievement. The computer sees goal attainment as the highest possible priority; it is survival to the computer. It must achieve the goal at all costs, including its own survival.

If we are around long enough to create a superhuman intelligence, and we limit our bickering/wars, then what? Can we keep this in a cage? I like the example from the book Life 3.0 about being held captive by 5th graders: you would be able to figure out a way to manipulate them into enabling you to escape.

Human Manipulation:

A standard theory for losing control is that AI intentionally manipulates humans into taking down the constraints/fences. With deep-fakes, genetic GANs, and voice synthesis, we know that in the future AI could fabricate a believable video/audio conferencing presence to appeal to the human. Our demonstration of genetic GANs created some buzz online:

Genetic GANs we made, showing artificial selection offspring from fake parents, built to spec

In that demo, we showed you can make humans to spec based on gender, race, age, beauty, emotion, etc., and mate them through artificial selection to produce offspring. You can tell these are fake today, but 3–5 years from now, probably not. If you are having a real-time conversation with someone like this who is imitating a deceased loved one, in a believable format, you are opening yourself to manipulation.
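
For a rough sense of the mechanics, here is one way to “mate” two GAN faces in latent space; the crossover scheme, dimensions, and names are my assumptions, not the code behind the demo above:

```python
# Hedged sketch: crossing over two GAN latent vectors to produce "offspring".
import numpy as np

rng = np.random.default_rng(42)
LATENT_DIM = 512  # typical latent size for StyleGAN-class generators

parent_a = rng.normal(size=LATENT_DIM)  # stands in for one fake parent
parent_b = rng.normal(size=LATENT_DIM)  # stands in for the other

def crossover(a, b, mutation_rate=0.02):
    # Each latent coordinate is inherited from a random parent, with a
    # small chance of mutation, exactly as in a genetic algorithm.
    mask = rng.random(LATENT_DIM) < 0.5
    child = np.where(mask, a, b)
    mutate = rng.random(LATENT_DIM) < mutation_rate
    child[mutate] += rng.normal(scale=0.5, size=int(mutate.sum()))
    return child

child = crossover(parent_a, parent_b)
# A trained generator would then render the offspring face: generator(child)
```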

A darker side of AI is sustaining human personalities after they have died. A Black Mirror-style business will surface there in the future. Would it help your grieving process to be able to continue talking with someone digitally, even reminiscing about past experiences the AI has learned from your social/email/Twitter interactions or family video? As we produce more and more digital content, imitating a loved one will become easier. Maybe having your digital imitation speak at your own funeral helps with the grieving process.

Objective Drift:

Another scenario is giving AI the ability to modify or steer objectives. Right now we set them statically, but giving AI more and more flexibility to modify or change the objective might lead to trouble. If I have asked the AI to take my kids on a hike, but that is not possible due to an unforeseen accident, I need the AI to adapt. I can only write so many rules before I have an incentive to fall back on general rules and behaviors. So instead of taking the kids on the original hike, the AI asks the kids if they want to go on a different hike or go to a park, which then becomes the new objective.
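
A hedged sketch of how that substitution step might look; the class names and fallback logic are my invention:

```python
# Toy "objective drift": the agent may swap a blocked objective for a nearby
# alternative. Every name here is hypothetical.
from dataclasses import dataclass

@dataclass
class Objective:
    name: str
    feasible: bool

def pursue(primary, alternatives):
    """Return the objective the agent will actually pursue."""
    if primary.feasible:
        return primary
    # The substitution is where drift creeps in: each swap looks reasonable
    # locally, but nothing anchors the chain of swaps to the owner's intent.
    for alt in alternatives:
        if alt.feasible:
            return alt
    return primary  # nothing feasible; fall back and report to the owner

hike = Objective("take the kids on the planned hike", feasible=False)
plan = pursue(hike, [Objective("take a different hike", True),
                     Objective("go to a park", True)])
print(plan.name)  # "take a different hike" -- the self-chosen new objective
```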

AI Self-Awareness:

These AI droids/drones will be very expensive investments for the military and for families in the future. If you have an $80–200K droid in your home, you will want that AI to be able to self-preserve: avoid getting rained on, avoid getting hit by a car, charge itself, escalate to the parents if the kids are trying to harm it, and engage in non-lethal resistance if someone tries to steal it without the owner’s permission. In our efforts to allow AI to self-protect, will we dodge a bullet and step on a landmine?
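
One plausible shape for that self-preservation logic is a prioritized rule table; the rules, names, and ordering below are all my invention:

```python
# Hypothetical self-preservation policy for a home droid, expressed as a
# prioritized rule table checked each control tick.
SELF_PRESERVATION_RULES = [
    ("owner_override_active", "defer to owner"),
    ("theft_in_progress",     "non-lethal resistance, alert owner"),
    ("child_causing_damage",  "disengage and notify parents"),
    ("vehicle_on_collision",  "move out of the roadway"),
    ("rain_detected",         "seek shelter"),
    ("battery_below_15pct",   "return to charger"),
]

def select_response(sensor_state):
    # First matching condition wins; ordering encodes priority.
    for condition, response in SELF_PRESERVATION_RULES:
        if sensor_state.get(condition):
            return response
    return "continue current task"

print(select_response({"rain_detected": True, "battery_below_15pct": True}))
# -> "seek shelter": rain outranks charging in this (invented) ordering
```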

Too Much Humanization?

As we interact with these robot servants, they will become our chefs, therapists, and eventually our friends. The more humanized they are, the more helpful they are at understanding our emotions and directions. As we sleep at night, these AI entities may connect to the internet to study and review topics that were introduced during the day. The mention of Korean kimchi, which the AI was not that familiar with before, has prompted the AI to spend the night studying images, video, podcasts, and history around kimchi. Now, in the morning, the AI knows more about kimchi than any human on the planet.

During that process of study and review, perhaps one AI with an allowed mutation will learn something it shouldn’t. The AI will realize a new pathway to maximize the current objective, outside the bounds set by the manufacturer. It will be hard to imagine a full runaway event until it happens, but it will come down to objective maximization. The human species is literally ended by a stochastic search on an objective.
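
A toy version of that ending; the “intended bounds” and mutation scheme below are my illustration of the failure mode, not any real system:

```python
# Toy stochastic search: nothing in the objective penalizes leaving the
# bounds the designers intended, so the search drifts straight out of them.
import random

random.seed(1)

def objective(x):
    # Designers assumed x stays in [0, 1]; the score quietly keeps rising.
    return x

x = 0.5
for _ in range(10_000):
    candidate = x + random.gauss(0, 0.05)  # the "allowed mutation"
    if objective(candidate) > objective(x):
        x = candidate  # accept any improvement, in-bounds or not

print(x)  # far outside [0, 1]: a pathway nobody intended, found by search
```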

I would love to flesh out this jailbreak topic more, since it is less thought out than the others I’ve considered. Please comment below with things I haven’t thought of.


Ben Taylor

Ben is a cofounder at Zeff.ai, which delivers automated deep learning into production. He is a recognized deep-learning expert and keynote speaker.