Blog#165: Comparing the reasoning skills of GPT-4 to GPT-3.5

The main goal of this article is to help you improve your English. I will use Simple English to introduce you to concepts related to software development and similar topics. The technical content may be explained better and more clearly elsewhere on the internet, but remember that the main purpose of this article is still to LEARN ENGLISH.


Hi, I'm Tuan, a Full-stack Web Developer from Tokyo 😊. Follow my blog so you don't miss useful and interesting articles in the future.

I got my hands on the new model and did some experiments.

When I logged into ChatGPT Plus, I was welcomed with a big smile and a friendly message saying "Hey there! OpenAI's new GPT-4 is so smart it can do all kinds of tricky thinking stuff!"

The greeting when I opened ChatGPT Plus today

When I clicked on the link, I saw some cool stuff about the three different models they had to offer: Legacy, Turbo (also known as Default) and GPT-4. It was like a comparison chart!

Legacy: This is the original model released as ChatGPT.

Default (Turbo): This was originally known as the “Turbo” model, but became the default based on user feedback. It’s more concise and much faster than the original model.

GPT-4: This is the newest model. Here speed is sacrificed for “advanced reasoning, complex instruction understanding, and more creativity”.

I was excited to compare the new GPT-4 model's intelligence to the default GPT-3.5 model and find out how much better it is!

I asked both models three questions to see how smart they were. The first one was a riddle, the second was the traveling salesman problem, and the third was a tricky one about family relationships. Let's see if they can figure them out!

Here are the results:

Question #1: Wolf, Chicken, and Feed Riddle

The default model got this hilariously wrong.

GPT-4 got it right.

This riddle is easy for most people to solve, but GPT-3.5 gave a confusing answer. GPT-4, however, was able to solve the riddle correctly, giving the right steps in the right order.
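For readers who like code, here is a minimal Python sketch of how a program could solve this kind of riddle with a breadth-first search. The exact prompt is not shown in this article, so the rules in the code are an assumption: the classic version, where a farmer can carry one item at a time, the wolf cannot be left alone with the chicken, and the chicken cannot be left alone with the feed. The names `ITEMS`, `UNSAFE`, `is_safe`, and `solve` are made up for this example.

```python
from collections import deque

ITEMS = ("wolf", "chicken", "feed")
# Assumed classic rules: these pairs cannot be left together without the farmer.
UNSAFE = [{"wolf", "chicken"}, {"chicken", "feed"}]


def is_safe(state):
    """state = (farmer, wolf, chicken, feed); 0 = starting bank, 1 = far bank."""
    farmer, *sides = state
    for bank in (0, 1):
        if bank == farmer:
            continue  # the farmer is watching this bank
        left_alone = {item for item, side in zip(ITEMS, sides) if side == bank}
        if any(pair <= left_alone for pair in UNSAFE):
            return False
    return True


def solve():
    """Return a shortest list of cargo choices, one per river crossing."""
    start, goal = (0, 0, 0, 0), (1, 1, 1, 1)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        farmer = state[0]
        # The farmer crosses alone (None) or with one item from the same bank.
        for cargo in (None, *ITEMS):
            nxt = list(state)
            nxt[0] = 1 - farmer
            if cargo is not None:
                idx = ITEMS.index(cargo) + 1
                if state[idx] != farmer:
                    continue  # the item is on the other bank
                nxt[idx] = 1 - farmer
            nxt = tuple(nxt)
            if nxt not in seen and is_safe(nxt):
                seen.add(nxt)
                queue.append((nxt, path + [cargo or "nothing"]))
    return None


if __name__ == "__main__":
    for step, cargo in enumerate(solve(), start=1):
        print(f"Crossing {step}: farmer takes {cargo}")
```

Because breadth-first search explores short paths first, the first solution it finds uses the fewest crossings. Under these assumed rules, the chicken has to cross first, because any other first move leaves an unsafe pair behind.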

Question #2: Traveling Salesman

GPT-3.5 used the Nearest Neighbor Algorithm. It applied the algorithm correctly, but the route it found is not the actual shortest path for the salesman.

I tried to force GPT-3.5 to brute-force the answer, but it still got it wrong.

GPT-4 successfully solved the traveling salesman problem for five cities.

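Even though there were only five cities, there were already 24 possible routes (4! = 24 when the starting city is fixed). The traveling salesman problem is known to be NP-hard in general, and checking every route quickly becomes impractical because the number of routes grows factorially with the number of cities. GPT-3.5 used the Nearest Neighbor Algorithm, a greedy method that always goes to the closest unvisited city, and it gave the wrong answer because its route wasn't the shortest possible path. I asked it to use the brute-force approach, but it still gave the wrong answer.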

GPT-4 was able to solve the traveling salesman problem by using brute force, which means it looked at all 24 possible routes and picked the shortest one.
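To make the difference between the two approaches easier to see, here is a small Python sketch that compares Nearest Neighbor with brute force on five cities. The city names and coordinates in `CITIES` are hypothetical, because the actual cities and distances from the question are not given in this article. The function names are also made up for this example.

```python
from itertools import permutations
from math import dist, factorial

# Hypothetical coordinates; the article does not list the real cities.
CITIES = {"A": (0, 0), "B": (1, 5), "C": (2, 0), "D": (6, 1), "E": (5, 4)}


def tour_length(order):
    """Length of the closed tour that visits `order` and returns to its first city."""
    return sum(dist(CITIES[a], CITIES[b]) for a, b in zip(order, order[1:] + order[:1]))


def nearest_neighbor(start):
    """Greedy heuristic: always travel to the closest city not yet visited."""
    tour, remaining = [start], set(CITIES) - {start}
    while remaining:
        nxt = min(remaining, key=lambda c: dist(CITIES[tour[-1]], CITIES[c]))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour


def brute_force(start):
    """Check every ordering of the other cities and keep the shortest tour."""
    others = [c for c in CITIES if c != start]
    return min(([start, *p] for p in permutations(others)), key=tour_length)


if __name__ == "__main__":
    print("Routes to check:", factorial(len(CITIES) - 1))
    greedy, best = nearest_neighbor("A"), brute_force("A")
    print("Nearest neighbor:", greedy, round(tour_length(greedy), 2))
    print("Brute force:     ", best, round(tour_length(best), 2))
```

Brute force is guaranteed to find a shortest tour because it checks all 24 orderings. Nearest Neighbor is much faster, but it can only promise a tour that is the same length or longer.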

Question #3: Family Relationships

The default model got this wrong and its answer is very confusing

GPT-4 got it wrong too, but at least its reasoning was better

This question was so confusing that even the advanced artificial intelligence programs GPT-3.5 and GPT-4 couldn't get it right. The correct answer is that my two friends are related as first cousins once removed.

GPT-4 still makes mistakes, but they are much less noticeable than the mistakes made by GPT-3.5. It is amazing that a model based on probability calculations can do so much.

I will look at how well GPT-4 handles coding tasks once I have some good tasks for it, and I will let you know the results.

And Finally

As always, I hope you enjoyed this article and learned something new. Thank you and see you in the next articles!

If you liked this article, please give me a like and subscribe to support me. Thank you. 😊


NGUYỄN ANH TUẤN

Hello, I'm Tuấn, a software engineer working in Tokyo. This is my personal blog, where I share the knowledge and experience I gain while developing myself. I hope the blog will be a source of inspiration and motivation for you. Let's learn and grow together every day!
