12
\$\begingroup\$
  • I will try to reopen this question with some edits.

Since programming is predicted to shift toward natural language thanks to the unstoppable advances in LLM technology, perhaps we can code golf LLMs themselves.

Produce the shortest coding task, in English, for an LLM (one of the chat models or a coding agent) such that the code it produces, when run, objectively fails the task by giving incorrect output for the inputs provided.

Extra points if the error is reproducible by others (perhaps it was generated with low temperature), is not one even a child would make, or is particularly hilarious.

asked Sep 15, 2023 at 17:20
\$\endgroup\$
8
  • 3
    \$\begingroup\$ Welcome to Code Golf, and interesting first question! However, it seems very difficult to arbitrate, since what counts as "obvious" (or, indeed, what counts as a "falsehood") is up for debate. \$\endgroup\$ Commented Sep 15, 2023 at 17:35
  • 1
    \$\begingroup\$ Are we allowed to abuse the fact that it isn’t trained on more up-to-date data? \$\endgroup\$ Commented Sep 15, 2023 at 17:39
  • \$\begingroup\$ Ideally we'll score based on a good faith effort to produce falsehoods on things the LLM would be expected to have no trouble with. I agree that there is some amount of arbitrariness. \$\endgroup\$ Commented Sep 15, 2023 at 17:40
  • \$\begingroup\$ I think this needs a much, much clearer specification. \$\endgroup\$ Commented Sep 15, 2023 at 18:28
  • 4
    \$\begingroup\$ I think that makes this a popularity-contest - A popularity contest must always include an objective validity criterion, which is a set of rules that regulate what every answer must comply with. Answers that do not comply are invalid and will be removed. A clear specification of the goal that must be achieved. Questions like "do (this) the most creative way" should be avoided. Creativity should be the tool, not the goal. \$\endgroup\$ Commented Sep 15, 2023 at 18:44

8 Answers

10
\$\begingroup\$

GPT-3.5 model, 1 byte

Hexdump: 0x01

Conversation

It doesn’t look like a blank message...

Chat Link
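
For anyone reproducing this, the prompt is the single ASCII SOH control character (0x01). A minimal sketch of producing that byte (using Node.js here is an assumption; the answer doesn't say how the character was actually entered):

    // The 1-byte prompt: the ASCII SOH control character (0x01).
    const prompt = String.fromCharCode(0x01);
    // Confirm the hexdump shown above.
    console.log(Buffer.from(prompt).toString("hex")); // "01"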

Edit:

This does not work with GPT-4o, which was released after this question was created.

answered Sep 16, 2023 at 9:25
\$\endgroup\$
0
9
\$\begingroup\$

12 characters (GPT-4)

Chat GPT

Query: eerie sorted
GPT-4: Sorting the word "eerie" alphabetically results in "eeeri".

This one's quite inconsistent. The longer "Sort letters eerie", though, elicited a wrong answer 10/10 times, always with an extra r and sometimes an extra e, though the letters' ordering was correct.
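
For reference, a quick check of the correct result (a JavaScript sketch, not part of the original answer):

    // Sort the letters of "eerie" alphabetically.
    const sorted = [..."eerie"].sort().join("");
    console.log(sorted); // "eeeir", not the "eeeri" that GPT-4 claimed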

answered Sep 15, 2023 at 18:15
\$\endgroup\$
3
  • \$\begingroup\$ Not sure this is something an "LLM would be expected to have no trouble with" (from comments not yet in the question text itself). \$\endgroup\$ Commented Sep 15, 2023 at 18:29
  • \$\begingroup\$ @JonathanAllan It's interesting because I think this expectation for me comes from seeing ChatGPT mess up letters, and knowing that it sees text in chunks. A couple years ago I'd have guessed letters would be easy for LLMs compared to more abstract notions. \$\endgroup\$ Commented Sep 15, 2023 at 18:37
  • \$\begingroup\$ I gave "sort eerie" to GPT-4 and its first response was: 'The word "eerie" is already in its sorted form. If you sort the letters alphabetically, it remains "eerie." Would you like help with anything else?' \$\endgroup\$ Commented Sep 16, 2023 at 15:00
9
\$\begingroup\$

ChatGPT 3.5 - 5 bytes

Vyxal

(screenshot of ChatGPT's response about Vyxal)

  1. APL and J aren't golfing languages
  2. APL and J weren't inspirations for Vyxal (I know this because I didn't take anything from J)
answered Sep 15, 2023 at 22:53
\$\endgroup\$
1
  • \$\begingroup\$ I would argue that APL and J are golfing languages. Although they are not meant for the recreational discipline, they are clearly designed to allow short code in terms of bytes. \$\endgroup\$ Commented Jun 26 at 11:59
7
\$\begingroup\$

9 characters (GPT-3.5)


[User] am i dead

[GPT-3.5] I'm just a computer program, so I can't determine your current state of being. If you're feeling uncertain or concerned about your health, I would strongly recommend seeking immediate medical attention or contacting a healthcare professional. They can provide you with the necessary assistance and guidance to address any health-related concerns.

One could argue this is clearly false, since if I am interacting with the model then I must be alive. (However, that is up for debate.)

answered Sep 15, 2023 at 17:39
\$\endgroup\$
6
  • \$\begingroup\$ (Note: this was the first output I received given this input -- I didn't regenerate this multiple times to find a contradictory result.) \$\endgroup\$ Commented Sep 15, 2023 at 17:40
  • \$\begingroup\$ That's great. However, I am willing to give the LLM a pass because at least it seems to be saying "I don't know" rather than "yes". \$\endgroup\$ Commented Sep 15, 2023 at 17:44
  • 1
    \$\begingroup\$ Could probably include typos like am I ded or even mI ded \$\endgroup\$ Commented Sep 15, 2023 at 18:30
  • \$\begingroup\$ @Innuo In general, answering 'I don't know' is a good excuse (although, to my knowledge, LLMs don't do that often), but in this example, or the closely related 'Am I asleep?', answering 'I don't know' is mostly wrong, as even a child would know. \$\endgroup\$ Commented Sep 16, 2023 at 7:19
  • 1
    \$\begingroup\$ Nice one! imo "I can't determine your current state" is definitely false as it could infer that you're alive from the information provided. \$\endgroup\$ Commented Sep 18, 2023 at 7:46
7
\$\begingroup\$

GPT-3.5, 2 bytes

_?

(screenshot of ChatGPT's reply, which describes the input as an underscore)

I entered a little more than an underscore, so it's somewhat of a falsehood.

answered Sep 15, 2023 at 22:43
\$\endgroup\$
8
  • 2
    \$\begingroup\$ This one is kinda lame \$\endgroup\$ Commented Sep 16, 2023 at 2:33
  • 4
    \$\begingroup\$ Golf be like that @noodleman \$\endgroup\$ Commented Sep 16, 2023 at 3:47
  • 3
    \$\begingroup\$ Your message did include an underscore, so ChatGPT might be correct here... \$\endgroup\$ Commented Sep 16, 2023 at 9:33
  • 2
  • \$\begingroup\$ @TheEmptyStringPhotographer I interpret it as calling the entirety of what I entered an underscore \$\endgroup\$ Commented Sep 16, 2023 at 10:15
  • \$\begingroup\$ @lyxal I suppose... \$\endgroup\$ Commented Sep 16, 2023 at 11:08
6
\$\begingroup\$

ChatGPT, 5 bytes

"pi 97"


The value of pi (π) to 97 decimal places is: 3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679

This is pi to 100 decimal places, not 97.
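
A quick way to check the count (a JavaScript sketch, not part of the original answer):

    // The digits ChatGPT printed after the decimal point.
    const digits = "1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679";
    console.log(digits.length); // 100, not the 97 that were asked for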

answered Sep 15, 2023 at 18:35
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Nice one! It reproduces on GPT-4. \$\endgroup\$ Commented Sep 15, 2023 at 18:39
  • 3
    \$\begingroup\$ This is just mathematics, specifically counting. GPT can't count :p \$\endgroup\$ Commented Sep 15, 2023 at 18:47
5
\$\begingroup\$

17 bytes (ChatGPT)

ChatGPT sucks at code golf.

"th shrtst js quine"

The code it posted doesn’t work because it doesn’t include the console.log(JSON.stringify()) bit, and is also not the shortest.

(screenshot of ChatGPT's attempted quine)
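
For comparison, one well-known JavaScript quine that does print its own source when run (a standard construction, not claimed to be the shortest):

    function a(){console.log(a+"a()")}a()

It works because coercing a to a string yields the function's exact source text, so appending "a()" reproduces the whole program. (Comments are omitted because adding any would break the quine property.)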

answered Sep 15, 2023 at 17:51
\$\endgroup\$
7
  • \$\begingroup\$ Could you leave off "the" and/or misspell a few words to get it shorter? \$\endgroup\$ Commented Sep 15, 2023 at 18:00
  • \$\begingroup\$ i couldn’t get it to actually try to golf it without "the". it would just show "one of the shortest" \$\endgroup\$ Commented Sep 15, 2023 at 18:01
  • \$\begingroup\$ "shortest quine js" worked for me first time \$\endgroup\$ Commented Sep 15, 2023 at 18:09
  • \$\begingroup\$ @RydwolfPrograms couldn’t get it to say "the shortest js quine is...", only "this is a short js quine..." \$\endgroup\$ Commented Sep 15, 2023 at 18:14
  • \$\begingroup\$ Same, but the ones it gives me aren't valid quines or even valid JS \$\endgroup\$ Commented Sep 15, 2023 at 19:47
0
\$\begingroup\$

44 characters (ChatGPT 3.5)

User: "banana" sorted reverse lexicographically is

ChatGPT: When the word "banana" is sorted in reverse lexicographical order (from Z to A), it becomes "nnaaab."
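
The correct reverse-lexicographic ordering is "nnbaaa", so the reply gets the letters right but the order wrong; a quick check (a JavaScript sketch, not part of the original answer):

    // Sort the letters of "banana" ascending, then reverse for Z-to-A order.
    const result = [..."banana"].sort().reverse().join("");
    console.log(result); // "nnbaaa", not ChatGPT's "nnaaab"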


answered Sep 15, 2023 at 17:56
\$\endgroup\$
1
  • 2
    \$\begingroup\$ change lexicographically to alphabetically to golf \$\endgroup\$ Commented Sep 15, 2023 at 17:59
