We consider communication between a source-destination pair in a dynamic flat-fading channel environment. The communication is aided by mobile relays, which beamform the source signal to the destination. The beamforming weights and the relay positions are optimally selected so that the average signal-to-interference-plus-noise ratio (SINR) at the destination is maximized subject to total relay power constraints. We assume a time-slotted system and model the channels as random fields with a deterministic path-loss component with unknown exponent, and random log-normal shadowing and multipath fading components with unknown model parameters. At the beginning of each slot, the relays optimally beamform based on estimates of the source and destination channels obtained up to their current positions and the current time. Then, for the remainder of the slot, the relays optimally select the positions from which they will beamform in the next slot. The relay positions are computed using a reinforcement learning approach, which recursively estimates the channel map in time and space.