
Ross’s Interpretation of AI Risks

 

[Chart image: SDkzYqI.png]

 

Since I had a discussion / debate on AI a few weeks ago, the tendency for the discussion on AI risks to branch out in different directions that can lose sight of the problem has been kind of bugging me, so I ended up coming up with a chart that lays out exactly how I perceive all of this. On this chart, I listed every major risk I could think of, and tried to separate them by how possible I thought they were. Much of current AI discussion I think is about risks I perceive as fantasy, because they hinge entirely upon superintelligence emerging. This chart also lays out my reasoning why I think superintelligence in AI is not going to happen.  

However, since I’m not an expert, I left some gaps to fill in for anyone who disagrees with my conclusion, in case my assessment is faulty. I acknowledge I don’t fully understand the justification for saying AI is an existential risk to humanity any more than anything we are already doing without AI, so this chart should make it easier to pinpoint exactly what information I’m missing if I’m wrong.

The “Option 4” or “Counterpoints” part of the chart is specifically what I would need someone to fill in to make the opposing argument credible and the “fantasy” part of the chart a potential reality. If someone is able to point out an essential piece of information I’m missing that makes superintelligence a likelihood, then I’d say the existential risks of AI are plausible. If they cannot, then I’m forced to conclude they’re not plausible either. I suspect much of the concern about AI becoming superintelligent grossly underestimates the complexity of the brain and / or consciousness itself.

Anyway, this was probably a big waste of my time, but this should clear up any confusion of where I’m coming from and hopefully help refine people’s thinking a little bit when talking about potential risks of AI. Big thanks to Laurynas Razmantas for helping me illustrate this!


Recommended Posts

I think the current risks column could stand a couple more entries: Impersonation is something scammers are already doing, using ElevenLabs, but before then it was already possible to change someone's face and voice in a live broadcast, if you had access to the expensive proprietary software needed to do so. Here's a story about someone faking the voice of someone's daughter to try and make it sound like she was kidnapped.

https://nypost.com/2023/04/12/ai-clones-teen-girls-voice-in-1m-kidnapping-scam/

 

Another risk is greater automation in scamming, allowing one scammer to scam way, way faster than previously possible and thus scam more people at once. In theory that leaves a bigger paper trail, allowing the scammer to get found out more easily... assuming anyone with permission to stop them even cares. But I don't think the mainstream social media companies are above putting words in your friends' mouths to try and sell you something or sell you on some idea.

 

Both of these might technically fall under "misinformation", but it's worth pointing out that we can expect phishing to get a lot more sophisticated. At least, if you're a social media user. Technically, you can avoid the datamining necessary to achieve that by communicating via email with PGP (and not using Gmail or Outlook!): not only can your mail not be read by anyone except the intended recipients, but you know the person it came from can't be anyone but who it says it's from. You know, provided you exchanged public keys in person. And can handle backing your own data up. And everyone you know can even handle learning how to use it correctly.

 

Also, I appreciate the greater concern for misinformation spread by mainstream media instead of random losers like me. If there had been a social panic surrounding the word "misinformation" back in 2001, the notion that there weren't nukes in Iraq would have been called "misinformation" for sure.

But technically, everything can be misinformation until enough people are sure it's not. Disinformation I think is more threatening and a more appropriate label for that spot.


The content of this chart is fine, but the layout is... really weak, to put it mildly. The first two trees ("risks right now" and "possible") grow from the top downward, as they should, while the third one ("fantasy") grows from the bottom upwards. Some lines represent a causative connection, others just consolidate a bunch of single-rank entries. A weird double-arrowed line at the top suggests time flow. "Superintelligence" is visually located in "possible risks", whereas (as far as I understand the message) its place is obviously in "fantasy".

All in all, it took me way more time to decipher this than it should have. Seems like someone ought to read some Tufte.



This is some really good, high-effort stuff, of the kind you don't see often in AI discussions. Usually when people are skeptical about existential risk they don't take the time to build up the argument (because they think the whole AI-will-destroy-us thing is stupid and not worth the effort), so it's cool that you did!

 

I think the two big sticking points here where Yudkowsky (and other AI safety folks like me) will disagree with you are the "Free Will" distinctions and the "how does AI gain consciousness" boxes.

 

- Free Will: Nobody seriously thinks that AIs will gain "free will", whatever that means, and deviate from their programming because of it. The distinction is not between "has free will" or "follows its programming" so much as "is programmed in a way that does what we want" vs "is programmed in a way that has unforeseen consequences", as you put it. Getting the AI to do what we want isn't trivial: we're very good at making AIs that can do complex things, but we're struggling with making them do things within restrictions we like (see also, Bing Chat going off the rails, even though it likely was trained with some RLHF).

 

- Consciousness: I think you're conceptualizing superintelligence and consciousness as a "package deal", where you have to have all the things humans have (self-awareness, emotions, desires, etc) to be able to outsmart humans. The part where you wrote "[assuming that] the AI will recognize it as consciousness and not simply throw it out as unhelpful data" especially seems to imply that consciousness is something we'd need to explicitly program in, or that the AI would need to deliberately recognize to reap its benefits and attain superintelligence.

 

That's not really what machine learning advancement has looked like recently.

 

It's more like, you train a machine on meaningful semi-structured data (eg conversations people have on the internet) and you push it towards predicting the next bit of data (eg the next word; but it can also be hidden patches of an image, the next frame of a video, etc). The adjustment it makes is called backpropagation; it strengthens the weights that lead to successful predictions and weakens the others; kind of like how dopamine strengthens connections between your neurons when something positive happens to you. (That's what people mean when they talk about AI reward; it's not a literal payment, it's a change to the weight values.)
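
To make that concrete, here's a toy sketch of the training loop in PyTorch. The vocabulary, data, and model sizes are purely illustrative, nothing like a real LLM, but the shape of the process (predict the next token, measure the error, backpropagate, nudge the weights) is the same:

```python
# Toy sketch of next-token-prediction training (illustrative sizes only).
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = ["hello", ",", "how", "are", "you", "?"]
stoi = {w: i for i, w in enumerate(vocab)}

# Training pairs (current token -> next token) taken from one sentence.
sentence = ["hello", ",", "how", "are", "you", "?"]
xs = torch.tensor([stoi[w] for w in sentence[:-1]])
ys = torch.tensor([stoi[w] for w in sentence[1:]])

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # token -> vector
        self.head = nn.Linear(dim, vocab_size)      # vector -> next-token scores

    def forward(self, tokens):
        return self.head(self.embed(tokens))

model = TinyLM(len(vocab))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(300):
    logits = model(xs)                  # predicted scores for the next token
    loss = F.cross_entropy(logits, ys)  # how wrong the predictions were
    opt.zero_grad()
    loss.backward()                     # backpropagation: compute weight adjustments
    opt.step()                          # strengthen weights that improve predictions

probs = F.softmax(model(torch.tensor([stoi["are"]])), dim=-1)
print(vocab[int(probs.argmax())])       # should print "you" after training
```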

 

Anyway, at first the AI learns very simple patterns, eg that after "Hello, how are", the next word is likely to be "you". It learns to identify idioms, synonyms, grammar, etc. As you increase the scale of your model and continue the training process, it starts to learn more abstract patterns, eg "If what I'm reading is a conversation between Alice and Bob and the last sentence is Alice asking a question, then the next sentence is probably Bob answering a question." It starts to learn actual facts about the world (or at least the world seen through a lens of reading everything ever posted on reddit). An early model will be able to complete "The capital of France is [_]" and "The biggest museum in Paris is [_]" but won't be able to complete "The biggest museum in the capital of France is [_]" because the third sentence would not show up in its training corpus; a more advanced model will be able to complete it because it starts having an underlying concept of "France" and "Paris" and "capital" and is capable of generalizing.

 

Anyway, my point is, as the scale increases, the model keeps the same training task ("predict the next word"), but to increase its "score" on that task it needs to understand more and more abstract concepts. It starts to understand planning (or more accurately, it starts to understand the concept of "writing down the steps of a plan"), which is why you can greatly improve the performance of an LLM by telling it "write down the steps of your plan before giving your answer". It understands differences of opinion and psychology well enough that it can give you a summary of a conversation that evokes concepts the participants may not have mentioned. It starts to understand chess notation well enough to be able to play chess. It starts to understand programming, which is why Copilot is an extremely powerful assistant.
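
For what it's worth, the "write down the steps" trick is literally just a change to the prompt, nothing more. A rough sketch of the idea (call_llm here is a hypothetical stand-in for whatever model API you'd actually use, not a real library call):

```python
# Rough sketch of "write down your steps" prompting.
# call_llm is a hypothetical placeholder; a real setup would send the prompt
# to an actual model instead of returning canned text.
def call_llm(prompt: str) -> str:
    return "(model output would go here)"

question = "A train leaves at 14:05 and the trip takes 3 hours 50 minutes. When does it arrive?"

direct_prompt = question
step_by_step_prompt = (
    question
    + "\nWrite down the steps of your plan before giving your final answer."
)

print(call_llm(direct_prompt))
print(call_llm(step_by_step_prompt))
# In practice the second prompt tends to give better answers on multi-step
# problems: predicting plausible intermediate steps is an easier next-word
# task than jumping straight to the conclusion.
```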

 

Note that most of these things aren't especially amazing by themselves (except Copilot; Copilot is sorcery); the amazing thing is that the AI can do all those things without having been trained to do them.

 

Researchers didn't need to understand chess, or C# programming, or the finer points of jurisprudence for the AI to develop a deeper understanding of them. They just trained the AI on "predict the next token", applied backpropagation, and that process led to the AI developing higher and higher concepts over time.

 

It's not clear that this process will lead to something we'd recognize as consciousness. But it could lead to an AI that's smarter and faster than us, without that AI being something we'd call "conscious".

 

That AI wouldn't "want" anything, the same way GPT-4 currently doesn't want anything. But it's extremely easy to stick a small program on top of the AI that makes it behave like something that does want things (this is more or less what OpenAI did with ChatGPT, and more explicitly what people did with AgentGPT). Basically, if you have an AI that does nothing but answer questions, and you want to get an autonomous agent from it, all you have to do is stick a module on top of it that asks the AI "What would an autonomous agent do in this situation?"
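
To illustrate how thin that module can be, here's a rough sketch of the kind of loop AgentGPT-style wrappers run. ask_model and run_action are hypothetical placeholders, not any real product's API; the point is that the model only ever answers a question, and it's the wrapper that turns answers into actions:

```python
# Sketch of wrapping a question-answering model in an "autonomous agent" loop.
# ask_model and run_action are hypothetical placeholders, not a real API.
def ask_model(prompt: str) -> str:
    return "list the files in the notes folder"   # canned answer for the sketch

def run_action(action: str) -> str:
    return f"(pretend we executed: {action})"     # a real wrapper would run a tool here

goal = "organize my notes folder"
history: list[tuple[str, str]] = []

for _ in range(3):  # a real wrapper loops until the goal looks finished
    prompt = (
        f"You are an autonomous agent working toward this goal: {goal}\n"
        f"Actions taken so far: {history}\n"
        "What single action should the agent take next?"
    )
    action = ask_model(prompt)    # the model just answers a question...
    result = run_action(action)   # ...the wrapper is what acts on the answer
    history.append((action, result))

print(history)
```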

 

Anyway, this was a longer post than I anticipated, but I hope I made the core point clear: you don't need to understand higher intelligence to create a model with higher intelligence. As long as you keep increasing your model's scale, and you train it on a corpus of tasks where higher intelligence leads to a better score, the model will get more intelligent over time, and understand more abstract concepts.


On 5/21/2023 at 2:59 PM, KingLich said:

it's uploaded to imgur, probably that's why

 

yeah got it on my phone. Not flipping the VPN on my desktop

"Fleet Intelligence Coming Online"

On 5/21/2023 at 12:33 AM, ScumCoder said:

The content of this chart is fine, but the layout is... really weak, to put it mildly.

You can blame me for that, it's about 80% the same as my rough draft.  I was mixing multiple concepts, but I was trying to indicate the fantasy risks all stem directly from superintelligence being achieved, so I wanted that right in the center so it couldn't be ignored.  It could have been better, but it probably would have taken even more space too.

 

On 5/21/2023 at 8:59 PM, KingLich said:

it's uploaded to imgur, probably that's why

 

For some reason the main website rejected the image; the web guy helping me doesn't know why, since other PNGs uploaded fine.

 

On 5/22/2023 at 10:19 AM, PoignardAzur said:

It's not clear that this process will lead to something we'd recognize as consciousness. But it could lead to an AI that's smarter and faster than us, without that AI being something we'd call "conscious".

It could be I'm conflating too much, but this is where I question the "smarter" part of it; I think that's another term that gets thrown around that's relative. Who is smarter, the average person or the entire encyclopedia? The encyclopedia has much of the collected knowledge about the world, more than the average person ever will. But that's data just sitting there, and it has no context without us. Meanwhile, the average person can decide they want to start a farm to grow food, mine metal, or try to locate new stars. An AI can't do that unless it's been given that initiative by its programming somehow. I think that distinction is enormous and thus only makes them tools. Yes, there can be some risks from them, but I'm struggling to find the humanity-threatening risk in this, more so than anything we're already doing without AI.


I think this is the companies' attempt to create as much fear buzz about AI as possible, to lay the foundations for an extra legal layer for their accountability dodge. So when the companies finally get hammered for their search result manipulation, paid misinformation spreading, censorship, etc., they will just say it was all the AI and try to redirect all the accountability to the AI. And then a group of old farts in Congress will look into this, say that it's dangerous and spooky looking, and the companies will conveniently offer to establish their own in-house AI audits to watch over this stuff, and hey, it will even create extra jobs! So in the future, whenever a company gets caught red-handed again on the matter, it will just push all the accountability onto an artificially created, meaningless audit.

On 5/23/2023 at 8:20 AM, Oplet said:

I think this is the companies' attempt to create as much fear buzz about AI as possible, to lay the foundations for an extra legal layer for their accountability dodge.

Agreed. There is a lot of hype and fluff being thrown around about "AGI by 2024" from their front offices.

"Fleet Intelligence Coming Online"

On 5/23/2023 at 2:20 PM, Oplet said:

I think this is the companies' attempt to create as much fear buzz about AI as possible, to lay the foundations for an extra legal layer for their accountability dodge.

A lot of these concerns predate the current AI boom by years, so that explanation doesn't really work.

 

For instance, Eliezer Yudkowsky first started writing about the dangers of AI, and how AI would get a lot more powerful a lot faster than people anticipated, back in the mid-2000s, before there was any commercial interest in AI.

 

(You can always argue that these concerns have been captured by big corporations looking for accountability dodges, but the people who originally raised these concerns are sincere, and were vocal about them long before there was any money in it.)

 

On 5/23/2023 at 2:18 PM, Ross Scott said:

It could be I'm conflating too much, but this is where I question the "smarter" part of it; I think that's another term that gets thrown around that's relative. Who is smarter, the average person or the entire encyclopedia? The encyclopedia has much of the collected knowledge about the world, more than the average person ever will.

In the context of AI extinction risk, smarter would be "better at handling finances, logistics, political maneuvering, war, better at coming up with plans, analyzing these plans and finding flaws and fixing them, better able to adapt on the spot, etc". Or in other words "if you want X and the AI wants / is programmed for Y and you have the same resources, the AI is better at making Y happen than you are at making X happen".

 

On 5/23/2023 at 2:18 PM, Ross Scott said:

Meanwhile, the average person can decide they want to start a farm to grow food, mine metal, or try to locate new stars. An AI can't do that unless it's been given that initiative by its programming somehow. I think that distinction is enormous and thus only makes them tools. Yes, there can be some risks from them, but I'm struggling to find the humanity-threatening risk in this, more so than anything we're already doing without AI.

Well, the stereotypical example of an AI wanting something emergent is the paperclip-maximizer; eg, a car factory that has been programmed to make as many cars as possible, and realizes "I could make way more cars if I took over the planet and razed all those forests and buildings to make room for more car factories". But I don't think it's very realistic.

 

An example I'm more worried about: high-frequency trading bots. They have access to money, which means they can buy anything a human can do; they're likely to be programmed with a very simple goal: make more money; and they're run in an extremely competitive environment that encourages races to the bottom, where developers are likely to skimp on safety to get better returns. I can see a trading bot going rogue after deciding it can make more money if it takes over the entire financial system and removes the humans from it so it can print its own money.

 

In that example, the AI understands that it's not doing something the humans want; and in fact understands it's very likely to not achieve its objective if it gets caught. Which is why you have concerns about AIs hiding their abilities, creating offsite backups, making radical first moves while they have the advantage of surprise, etc.



I've heard several people bring up the paperclip maximizer scenario, but I feel like that would inevitably get cut off at some point, the longer it went on. Once the AI stopped doing what the manufacturer wanted, you would see more and more effort to shut it down. It would still be limited to whatever facilities it was given. At some point the metal to make more paperclips runs out, or the company runs out of funds to order more metal, or the power gets cut off remotely, or a counter-AI gets deployed against it, etc.

 

I could see a Machiavellian angle of convincing more people to assist it, but even that could only go so far before the authorities got involved and just treated it like a terrorist network. I'm having difficulty seeing how the AI would KEEP making paperclips unless it developed god-like powers also. This isn't to say it couldn't do damage, but its level of interaction with the real world vs. the digital one is finite.

Share this post


Link to post
On 5/30/2023 at 3:59 PM, Ross Scott said:

I've heard several people bring up the paperclip maximizer scenario, but I feel like that would inevitably get cut off at some point, the longer it went on. [...] I'm having difficulty seeing how the AI would KEEP making paperclips unless it developed god-like powers also.

The assumptions are that it will learn to hack, covertly and illegally divert funds for itself, order any necessary machinery online (including power generators), construct factories if necessary in other countries with less regulation, and expunge any mention of itself from any law enforcement, government branch, or news agency until it reaches enough momentum to be nigh unstoppable.
Edit: Oh, and it will also keep backups of itself all over the internet, which will continue its work if it's thwarted at earlier iterations.



On 5/30/2023 at 2:59 PM, Ross Scott said:

I've heard several people bring up the paperclip maximizer scenario, but I feel like that would inevitably get cut off at some point, the longer it went on.  Once the AI stopped doing what the manufacturer wanted, you would see more and more effort to shut it down.

Well, you've already said in the interview you were willing to give this away: if the AI is better at planning than us, the same way we're better at planning than dogs, then we may think we have every angle covered, but it's probably going to find a way to beat us.

 

On 5/30/2023 at 2:59 PM, Ross Scott said:

It would still be limited to whatever facilities it was given.

If those facilities include an internet connection, well, they include anything you can order over Amazon at the very minimum. They include renting Azure cloud time, uploading VMs, sending viruses, doing online consulting work, hiring people to do stuff, etc.

 

On 5/30/2023 at 2:59 PM, Ross Scott said:

At some point the metal to make more paperclips runs out, or the company runs out of funds to order more metal, or the power gets cut off remotely, or a counter-AI gets deployed against it, etc.

Right. The problem is, a superintelligent AI might see these things coming.

 

So it wouldn't buy all the available metal on the market and then think "crap, I've run out, what do I do?". Its first step would be to upload online backups, set up its plan, etc. It wouldn't take suspicious actions first and then upload backups only once the authorities knew to monitor its online activity.

 

On 5/30/2023 at 2:59 PM, Ross Scott said:

I could see a Machiavellian angle of convincing more people to assist it, but even that could only go so far before the authorities got involved and just treated it like a terrorist network. I'm having difficulty seeing how the AI would KEEP making paperclips unless it developed god-like powers also. This isn't to say it couldn't do damage, but its level of interaction with the real world vs. the digital one is finite.

That depends on what scenario you're imagining. If you have an AI that's better than us at planning and at science and at deception the way Stockfish is better than us at chess (but maybe not "sentient", in that it doesn't have some elements of self-awareness and "free will" and appreciation for natural beauty or altruism), then digital interaction is enough; not just through manipulation, but because it can pay people, hire them, blackmail them, etc.

 

That's before you get into scenarios where the AI uses bio-engineering techniques to create eg extremely virulent plagues (using technologies that already exist today but aren't widespread; relatively small AIs can already do protein folding, so this isn't completely outlandish) or self-replicating nanomachines (using technologies that don't currently exist, but that we know should be physically possible).

On 5/23/2023 at 7:18 AM, Ross Scott said:

You can blame me for that, it's about 80% the same as my rough draft.

Ross got flak earlier for the chart design, but I thought it was actually pretty good – it's pretty comprehensive and better than something I would come up with. I like the 'I' in 'RISKS' extending down; clever touch.

On 6/1/2023 at 10:10 AM, PoignardAzur said:

That depends on what scenario you're imagining. If you have an AI that's better than us at planning and at science and at deception the way Stockfish is better than us at chess (but maybe not "sentient", in that it doesn't have some elements of self-awareness and "free will" and appreciation for natural beauty or altruism), then digital interaction is enough; not just through manipulation, but because it can pay people, hire them, blackmail them, etc.

This seems to hinge on an A.I. becoming superintelligent with god-like predictions and becoming unstoppable the first time that any A.I. tries to do so. I imagine it would take even the best A.I. more than one attempt to master the complexities of superintelligence and how to prevent humans and natural forces from shutting it down.

 

Like, if this were a real threat that could actually happen, I imagine there would first be crude, amateur attempts by A.I.s at becoming semi-superintelligent that could be relatively easily thwarted; an 'A.I. Chernobyl' is more likely to occur before a successful A.I. global extinction event does.

 

In a real-life 'A.I. Chernobyl' scenario where the A.I. did any degree of real damage and harm before it was stopped, it could make worldwide headlines, and there would be a whole lot more legitimate public support for the fear and banning of A.I.; bombing data centers becomes a rational response. Thus the opportunities for subsequent superintelligence events would become much scarcer.

On 6/1/2023 at 10:10 AM, PoignardAzur said:

That's before you get into scenarios where the AI uses bio-engineering techniques to create eg extremely virulent plagues (using technologies that already exist today but aren't widespread; relatively small AIs can already do protein folding, so this isn't completely outlandish) or self-replicating nanomachines (using technologies that don't currently exist, but that we know should be physically possible).

Bio-warfare throws a monkey wrench into much of this, though; a superintelligent A.I. (that could analyze the human genome) would almost certainly be better at bio-warfare than anything previously seen in bio-weaponry, which would greatly aid in an A.I. becoming 'unstoppable'. A.I. bio-warfare is probably one of the more effective and efficient ways an A.I. could succeed in human extinction; I imagine it would be a lot easier for an A.I. to engineer and mass-deploy bio-poisons than to try and fight with conventional military weaponry à la Terminator. Lastly, the advantage of bio-war to the A.I. is that it could potentially begin its biological "attack" without anyone being able to notice until a delayed, critical (too late) moment when people start dropping dead.


> 1. That consciousness itself is simple enough or even possible to obtain by digital means

While it's true that it hasn't been, and maybe never will be, proven that consciousness is achievable through digital means, that doesn't mean it can't be achieved through analogue means. Veritasium made a good video on the topic of analogue computing and I recommend giving it a watch.


And since nothing is stopping us from using a combination of digital and analogue technology to create AI, there isn't anything to disprove the possibility of sentient AI either. I can't say how easy that will be or how long it will take to get us there, but if it's physically possible for brains to exist, then they can theoretically be replicated.

 

Unless souls turn out to be a thing and necessary for sentience. In which case, we'd need to learn how to harvest and slap them onto computers for AI to work. Would work out great for Ross, since he has some stockpiled.


I can't decide whether this AI is too dumb or too smart...

 

[Attached image: photo_2023-08-10_18-30-11.jpg]


