Found this in a children’s book of riddles:
Six brothers were spending their time together.
The first brother was reading a book.
The second brother was playing chess.
The third brother was solving a crossword.
The fourth brother was watering the lawn.
The fifth brother was drawing a picture.
Question: what was the sixth brother doing?
I can’t get ChatGPT to answer correctly with the usual tricks, even after hinting that it should consider one- and two-person activities and emphasizing the word “together”.
After a bunch of CoT turns we arrive at the conclusion that this is an open-ended question and not a riddle :)
After trying 3 times with fresh prompts, I got a correct response once, but when prompted to provide supporting reasoning, the model backtracked and started apologizing.
Can’t test GPT-4 right now…
Perhaps there’s a language barrier here, but none of those activities hint at a garden? In my locale, a garden is a small patch used to grow veggies, herbs, and/or flowers. So I would answer this with “their back yard.”
This is a much better riddle for children IMO, because it’s barely open-ended at all. The original has almost infinite answers without any leaps or tricks, but yours has a very limited domain: a yard/garden. Though if someone were extra clever, the problem space opens back up to nearly infinite (if brother 4 is playing a video game).
For personal testing, that’s certainly a valid opinion! But it’s not very productive from an objective standpoint because it can’t be graded and tests a “gotcha” path of thinking, when we’re still focusing on fundamentals like uniform context attention, consistency over time, etc.