Abstract: Previous work on graph convolutional networks, built on temporal recurrent networks and on frequency-domain representations, has shown impressive results in three-dimensional human motion prediction. However, the time domain and the frequency domain are two representations of the same human motion signal. This paper therefore encodes the observed motion sequence, together with the human pose, in both the time and frequency domains, and uses an attention mechanism over the joint information from the two channels to strengthen the interdependence between the nodes of the human skeleton. Finally, a graph-based gated recurrent unit (G-GRU) recursively decodes the encoded information and outputs the predicted motion sequence. We evaluated our model on the Human3.6M and CMU-MoCap datasets, and the experiments show that it produces more accurate predictions than previous methods.
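To illustrate the statement that the time and frequency domains describe the same motion signal, the following minimal sketch transforms a single joint-coordinate trajectory into frequency coefficients and back. It assumes an orthonormal discrete cosine transform (DCT-II) as the frequency representation, which the abstract does not specify; the function names and the sample trajectory are hypothetical and are not taken from the paper's implementation.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows (assumed transform)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n)]
        )
    return rows


def to_frequency(trajectory):
    """Project one joint-coordinate trajectory onto the DCT basis."""
    basis = dct_matrix(len(trajectory))
    return [sum(b * x for b, x in zip(row, trajectory)) for row in basis]


def to_time(coefficients):
    """Invert the transform; for an orthonormal basis the inverse is the transpose."""
    n = len(coefficients)
    basis = dct_matrix(n)
    return [
        sum(basis[k][t] * coefficients[k] for k in range(n)) for t in range(n)
    ]


# Hypothetical data: one coordinate of one joint over 8 observed frames.
trajectory = [0.10, 0.14, 0.21, 0.30, 0.38, 0.43, 0.45, 0.44]
coefficients = to_frequency(trajectory)
recovered = to_time(coefficients)

# The round trip reproduces the input up to floating-point error,
# so both views carry the same information about the motion.
max_error = max(abs(a - b) for a, b in zip(trajectory, recovered))
```

Because the transform is invertible, neither representation contains information the other lacks. A two-channel encoder of the kind described above would differ only in which structure of the signal each channel exposes: frame-to-frame change in the time channel, and smooth trends versus rapid variation in the frequency channel.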