Hyperspectral images (HSI) are renowned for their high spectral resolution and extensive wavelength coverage, but they often suffer from limited spatial and temporal resolution due to imaging sensor constraints. These limitations hinder the use of hyperspectral images for acquiring fine surface detail and for continuous observation over time. Spatial-temporal-spectral fusion (STSF) aims to synthesize the temporal, spatial, and spectral information of multisource remote sensing images to reconstruct hyperspectral images with high spatial and temporal resolution. However, most existing STSF methods still rely on the assumption of linear spatial, temporal, and spectral relationships. Furthermore, STSF methods designed for Landsat and MODIS cannot directly process current hyperspectral data, which has a lower temporal resolution. This paper proposes an unsupervised spatial-temporal-spectral fusion model for hyperspectral images using a global shared convolutional neural network (UGSCNN). The proposed method consists of two models: 1) a spatial-spectral down model, which combines spectral unmixing theory with deep learning to downsample the image in both space and spectrum; and 2) a spectral up model, which exploits globally shared information to upsample the spectrum. To verify the proposed method, we compared it with existing methods on both simulated and real on-board data; the experimental results demonstrate its effectiveness and practical fusion performance.
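To make the two-model design concrete, the following is a minimal PyTorch sketch of how such a pipeline could be wired, assuming an unmixing-style abundance bottleneck for the down model and a spectrally shared mapping for the up model; the module names, band counts, and layer choices are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class SpatialSpectralDown(nn.Module):
    """Hypothetical down model: estimate per-pixel abundances (spectral
    unmixing idea), then downsample space and project to fewer bands."""

    def __init__(self, hsi_bands=128, msi_bands=8, num_endmembers=32, scale=4):
        super().__init__()
        # Abundance estimation: 1x1 convs map HSI bands to endmember abundances.
        self.abundance = nn.Sequential(
            nn.Conv2d(hsi_bands, num_endmembers, kernel_size=1),
            nn.Softmax(dim=1),  # abundances sum to one across endmembers
        )
        # Spatial downsampling, standing in for the sensor's point spread function.
        self.spatial_down = nn.AvgPool2d(kernel_size=scale, stride=scale)
        # A 1x1 conv plays the role of a learned spectral response function.
        self.spectral_down = nn.Conv2d(num_endmembers, msi_bands, kernel_size=1)

    def forward(self, hsi):
        a = self.abundance(hsi)       # (B, E, H, W)
        a = self.spatial_down(a)      # (B, E, H/s, W/s)
        return self.spectral_down(a)  # (B, msi_bands, H/s, W/s)


class SpectralUp(nn.Module):
    """Hypothetical up model: the same weights are applied to every input
    image, so the band-expansion mapping is globally shared across dates."""

    def __init__(self, msi_bands=8, hsi_bands=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(msi_bands, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, hsi_bands, kernel_size=1),
        )

    def forward(self, msi):
        return self.net(msi)


if __name__ == "__main__":
    hsi = torch.randn(1, 128, 64, 64)  # toy high-band, high-resolution input
    down, up = SpatialSpectralDown(), SpectralUp()
    lr_msi = down(hsi)                 # simulated low-resolution, low-band image
    print(lr_msi.shape)                # torch.Size([1, 8, 16, 16])
    print(up(lr_msi).shape)            # torch.Size([1, 128, 16, 16])
```

In an unsupervised setting such as the one the abstract describes, models like these are typically trained by enforcing consistency between the down-sampled outputs and the observed low-resolution images, so that no high-resolution ground truth is required.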