Wapo journalist verifies that robotaxis fail to stop for pedestrians in marked crosswalk 7 out of 10 times. Waymo admitted that it follows “social norms” rather than laws.
The reason is likely to compete with Uber, 🤦
Wapo article: https://www.washingtonpost.com/technology/2024/12/30/waymo-pedestrians-robotaxi-crosswalks/
Cross-posted from: https://mastodon.uno/users/rivoluzioneurbanamobilita/statuses/113746178244368036
People, and especially journalists, need to get this idea of robots as perfectly logical computer code out of their heads. These aren’t Asimov’s robots we’re dealing with. Journalists still cling to the idea that all computers are hard-coded. You still sometimes see people navel-gazing on self-driving cars, working the trolley problem. “Should a car veer into oncoming traffic to avoid hitting a child crossing the road?” The authors imagine that the creators of these machines hand-code every scenario, like a long series of if statements.
But that’s just not how these things are made. They are not programmed; they are trained. In the case of self-driving cars, they are simply given a bunch of video footage and radar records, and the accompanying driver inputs in response to those conditions. Then they try to map the radar and camera inputs to whatever the human drivers did. And they train the AI to do that.
This behavior isn’t at all surprising. Self-driving cars, like any similar AI system, are not hard coded, coldly logical machines. They are trained off us, off our responses, and they exhibit all of the mistakes and errors we make. The reason waymo cars don’t stop at crosswalks is because human drivers don’t stop at crosswalks. The machine is simply copying us.
I think the reason non-tech people find this so difficult to comprehend is the poor understanding of what problems are easy for (classically programmed) computers to solve versus ones that are hard.
if ( person_at_crossing ) then { stop }
To the layperson it makes sense that self-driving cars should be programmed this way. Aftter all, this is a trivial problem for a human to solve. Just look, and if there is a person you stop. Easy peasy.
But for a computer, how do you know? What is a ‘person’? What is a ‘crossing’? How do we know if the person is ‘at/on’ the crossing as opposed to simply near it or passing by?
To me it’s this disconnect between the common understanding of computer capability and the reality that causes the misconception.
I think you could liken it to training a young driver who doesn’t share a language with you. You can demonstrate the behavior you want once or twice, but unless all of the observations demonstrate the behavior you want, you can’t say “yes, we specifically told it to do that”