abstract: Bell's theorem famously shows that no local theory can account for the predictions of quantum mechanics, while the Kochen-Specker theorem shows the same for non-contextual theories. Non-locality, and increasingly also contextuality, play an important role as computational resources in current work on quantum information. Much has been written on these matters, but there is surprisingly little unanimity even on basic definitions or on the inter-relationships among the various concepts and results. We use the mathematical language of sheaves and monads to give a very general and mathematically robust description of the behaviour of systems in which one or more measurements can be selected and one or more outcomes observed. In particular, we give a unified account of contextuality and non-locality in this setting. A central result is that an empirical model can be extended to all sets of measurements if and only if it can be realized by a factorizable hidden-variable model, where factorizability subsumes both non-contextuality and Bell locality. Thus the existence of incompatible measurements is the essential ingredient in non-local and contextual behaviour in quantum mechanics. We give hidden-variable-free proofs of Bell-style theorems. We identify a notion of strong contextuality, which yields surprising separations between non-local models: the Hardy model is not strongly contextual, whereas the GHZ model is. We interpret the Kochen-Specker theorem as a generic (model-independent) strong contextuality result. We give general combinatorial and graph-theoretic conditions, independent of Hilbert space, for such results.