The vast majority of the literature on reinforcement learning with function approximation studies settings in which the agent has direct access to the latent state of the system. In this work, we study reinforcement learning under more realistic conditions, where the learner operates on rich, high-dimensional observations, but the underlying (``latent'') dynamics are comparatively simple. We investigate the statistical requirements and algorithmic principles for reinforcement learning under general latent dynamics, and propose a modular framework for algorithm design: first learn a representation that decodes the latent state, then apply an RL algorithm to the latent dynamics through this representation.