abstract: Most living organisms need to find and exploit resources in their environment for survival. Understanding how these tasks can be performed efficiently and with little computation remains a challenge in ecology and cognitive sciences. We consider a mathematically tractable model where a random walk or Lévy flight visits sites characterized by quenched waiting times (“resources”), with a preference for revisiting sites that have been often occupied in the past. This model, which is inspired by diffusion with stochastic resetting, allows us to show how a single process with memory can learn by reinforcement about salient spatial features of an environment. With just one impurity site, as soon as the rate of memory use crosses a critical threshold, diffusion can be suppressed altogether and the walker becomes localized around the impurity. This phase transition belongs to the same universality class as the self-consistent theory of Anderson localization, and its properties depend mainly on the standard no-return probability of simple random walks on lattices. We further discuss possible applications of biologically inspired non-Markovian search algorithms to hard problems in computation and other areas of science.