Humans learning a complex task are picky and sticky
Tiago Quendera, Zachary F. Mainen, Champalimaud Foundation, Portugal; Dongrui Deng, Xi'an Jiaotong University, China; Mani Hamidi, University of Tübingen, Germany; Mattia Bergomi, Veos Digital, Italy; Gautam Agarwal, Claremont Colleges, United States
Session:
Posters 3 (Poster)
Location:
Pacific Ballroom H-O
Presentation Time:
Sat, 27 Aug, 19:30 - 21:30 Pacific Time (UTC -8)
Abstract:
While neural networks approach human levels of performance in many complex tasks, they require much more training than humans. This may be because only humans can infer and apply generalizable principles from prior experience (Lake, Ullman, Tenenbaum, & Gershman, 2017). However, the statistics that underlie the human learning process are poorly understood and hard to investigate in the large state spaces found in most complex tasks (van Opheusden & Ma, 2019). We thus designed a cognitive task whose potential solutions are few enough for subjects to densely sample policy space, but complex enough to compel intelligent search. We launched the game as a smartphone-based app (hexxed.io) to collect data from 10k human participants. We find that unlike reinforcement learning agents (Deep Q-Networks; DQNs), humans (1) search a highly restricted subset of the policy space; (2) attempt even poor solutions many times before discarding them; (3) arrive at the optimal policy suddenly and unpredictably with a "leap of insight". Our data suggest a "top-down" learning process by which humans propose explanatory solutions which they replace only upon collecting sufficient evidence to the contrary, in contrast to the "bottom-up" learning of DQNs that associates states with rewarding actions.