So suppose you give a human and a system the same 10 tasks. The human completes 3 correctly, gets 4 wrong, and fails to finish the other 3; the software does 9 correctly and fails to complete 1. What does that mean? In general I’d say the tasks need to be defined first, because right now I can give people plenty of tasks that language models can solve and they can’t, yet language models still aren’t “AGI” in my opinion.
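A minimal sketch of what such a comparison actually measures, assuming we score each task as correct, incorrect, or incomplete. The counts mirror the example above (adjusted so they sum to 10); the names and scoring scheme are illustrative, not a real benchmark.

```python
# Hypothetical task-scoring sketch: compare a human and a system on
# the same 10 tasks. Each task outcome is "correct", "incorrect",
# or "incomplete".
from collections import Counter

human = ["correct"] * 3 + ["incorrect"] * 4 + ["incomplete"] * 3
system = ["correct"] * 9 + ["incomplete"] * 1

def summarize(outcomes):
    """Return the fraction of tasks in each outcome category."""
    counts = Counter(outcomes)  # missing categories count as 0
    total = len(outcomes)
    return {k: counts[k] / total for k in ("correct", "incorrect", "incomplete")}

print(summarize(human))   # human: 30% correct on these tasks
print(summarize(system))  # system: 90% correct on the same tasks
```

The sketch makes the point concrete: a raw accuracy gap on a fixed task list tells you who did better on *those* tasks, but nothing about generality unless the task set itself is defined and justified.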
Agree. And these tasks can’t be tailored to the AI just so it has a chance. It needs to drive to work, fix the computers/plumbing/whatever there, earn a decent salary, then return with some groceries and cook dinner. Or at least do something comparable to what a human does. Just drafting emails and writing boilerplate code isn’t enough in my eyes, especially since it even struggles with that. It’s the “general” that is missing.
By the same token… “Fluently translate this email into 10 randomly chosen, distinct languages” is a task that 99.999% of humans would fail but that a language model should be able to handle.
Agree. That’s a genuinely useful thing LLMs can do. I’m still waiting for Mozilla to integrate Japanese and a few other (distant to me) languages into my browser. And it’s a huge step up from Google Translate: it can handle (to a degree) proverbs, nuance, tone… There are a few things AI or machine learning can do very well, outperforming any human by a decent margin.
On the other hand, we’re talking about general intelligence here, and translating is just one niche task. By definition that’s narrow intelligence. It’s still very useful to have, though, and I hope it will connect people and broaden their (and my) horizons.
Requiring it to drive to work and fix the plumbing is more about robotics than AGI, though. A system can be generally intelligent without having a physical body.
Any cognitive task. Not just the “9 out of the 10 you were able to think of right now”.
“Any” is very hard to benchmark, and it’s also not how humans are tested.