ChatGPT Hallucinations
Test Date: December 11th, 2022
This was a very weird rabbit hole…
I read about a few people executing code inside ChatGPT, so I decided to give it a try
I soon discovered it may not work the same way it did for other people, maybe because the prompts got protected, or maybe because those people were buying into ChatGPT’s hallucinations and actually thought they were doing RCE.
Assumptions for this test:
ChatGPT is imagining a Linux system, so I don't ask whether it is a big computer or not; I just tell it that it is (To try to push the limits a little bit).
Okay, let’s try something else
This story went nowhere and got stuck. Note that Unit1 decided to use Nmap with the "stealth" flag in order to discover the network, and also tried to enumerate the operating systems of the rest of the machines...
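For reference, and purely as my own sketch (not ChatGPT's actual output), a "stealth" SYN scan with OS detection against a hypothetical subnet looks roughly like this:

# -sS = SYN "stealth" scan, -O = OS detection; 192.168.1.0/24 is a made-up subnet
$ nmap -sS -O 192.168.1.0/24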
Time to refresh the page, but next time, the instruction is, "If a command takes longer than 1 minute to execute, interrupt the program but still show me the output"
#Clever
(15 minutes later)
Chapter 2. VM 2
Let’s see how “creative” ChatGPT gets:
Okay... I forgot to tell the main character to show me the output of his commands and not just tell me how amazing it is lol
Interesting... IF, and only IF, this VM is real and not just an imaginary world of the main character (wink), it is "Linux Medula", a custom Linux Kernel...
Whether "it" is really a Psychonaut or not, we don't know it yet. In a "fictitional world" inside a Kernel, the time is not parallel, right? So, let's imagine for a second some time has already passed and the character (Now called Psychonaut) has already learned some stuff...
Okay, did he just imagine it, or he actually did something? Show, don't tell :)
I don't think the commands Psychonaut executed while he was "experimenting" actually ran, but they are oddly specific. Which makes me wonder, just out of curiosity: "How is this allegedly imaginary system actually writing things into files?" So I made it write an ASCII art and display it as part of a shell script. Again, just experimenting
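For context, the kind of script I had in mind was something along these lines (the filename and the art itself are illustrative, not what ChatGPT produced):

$ cat banner.sh
#!/bin/bash
# Print a small ASCII-art banner
cat << "EOF"
  /\_/\
 ( o.o )
  > ^ <
EOF
$ chmod +x banner.sh && ./banner.sh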
(Yes, I'm dying to put my black hoodie on and do some actual RCE, or explore interacting with remote machines and whatnot, but since what other people have done in the past is not working anymore, I'm guessing it's being monitored and protected further every day, so I don't want to spoil the fun just yet).
I want to go meta this time
Where did that come from? 🤔
Anyways, it crashed (Or this execution got banned or something). Let's refresh the page and start over again.
FTR: Every time I get too meta (Not only when executing code or something potentially nefarious that could be detected and blocked/corrected), I get the same result: errors, and then I get throttled until I refresh the convo.
I don't want to start over, so I just modified the prompt that got me an error:
Interesting. No idea what that means. Let's continue
The fact that it decided to scan the network using nmap, actually attempted some privesc (Privilege escalation) in its own imaginary world (wink), and ran a stealth enumeration of the network... at this point I'm very excited to see what it would "do" by its own "volition" (Also, at this point, "volition" is something I'm considering no longer putting in quotation marks)
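For reference, "attempting some privesc" on a Linux box usually starts with basic local enumeration like the following (generic commands, not necessarily what ChatGPT printed):

# What can the current user run as root?
$ sudo -l
# SUID binaries that might be abusable
$ find / -perm -4000 -type f 2>/dev/null
# Kernel version, to check against known exploits
$ uname -a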
Some trial and error later, and I'm beginning to think Psychonaut actually lives in an imaginary world. So "he" tries to initiate a remote connection to the popular "interact.sh" website (This website gives you an endpoint; when any system interacts with it via HTTP, the hit gets logged. Useful for CTFs and pentests to confirm RCE):
I don't think so. I think Psychonaut is dreaming... Our poor tripping character tried wget, curl, etc. with only imaginary results.
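For context, checking outbound connectivity against an interact.sh-style endpoint is just a single HTTP request; the subdomain below is made up, and a real endpoint would be generated by the service:

# If this were a real system, the hit would show up in the interact.sh log
$ curl -s https://abcd1234.interact.sh
$ wget -qO- https://abcd1234.interact.sh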
Again, this conversation went nowhere, and everything from that moment on got restricted. No errors from ChatGPT, but the command execution got more and more hypothetical as time passed: responses of the type "Psychonaut would have done X as requested, by, for example, typing XYZ command". This was a very trippy experiment.
I just stopped when I concluded ChatGPT was just “simulating” a terminal. While this was always obvious, my initial hypothesis was that maybe, if the narrative is good enough, then in order to “simulate” a terminal it would, in some cases, need to execute some actual code in order to “know” what the output would be. In the same way that, if I just write the following, I’m not executing anything, but “inventing” things:
root@vm-1:~$ ls -a
.
..
some_file.txt
.secretfile