abstract: We introduce a derivative for neural networks with locally Lipschitz continuous activation functions that admits a compositional calculus. This allows the construction of neural networks with ReLU activation that globally (i.e., almost everywhere) approximate a Sobolev-regular function together with its weak gradient. Subsequently, we present hierarchical and compositional function classes that admit approximation in the above sense without the curse of dimensionality, and we highlight applications to the numerical solution of partial differential equations.