This dataset contains videos of wooden box assembly. Its main strengths are a standardized, uniform workflow and the use of multiple cameras that capture videos from different angles. In total, 62 videos of 17 subjects were collected, with a combined duration of 13.0 hours. The workflow is divided into nine steps. The dataset provides the videos of the assembly process together with temporal annotations of the nine steps in each video. It can support research in applications such as object recognition, human action classification, and intelligent automation.
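
As an illustration of how per-video temporal annotations of the nine steps might be handled, the following is a minimal Python sketch. The annotation file format is not specified here, so the `StepSegment` class, its field names, the use of seconds as the time unit, and the `step_durations` helper are all hypothetical, and the example values are made up rather than taken from the dataset.

```python
from dataclasses import dataclass

NUM_STEPS = 9  # the workflow is divided into nine steps


@dataclass(frozen=True)
class StepSegment:
    """Hypothetical record: one annotated step and its time span in a video."""
    step: int       # step index, assumed to run from 1 to NUM_STEPS
    start_s: float  # assumed unit: seconds from the start of the video
    end_s: float    # assumed unit: seconds from the start of the video

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def step_durations(segments):
    """Return {step: total annotated seconds}, validating each segment."""
    totals = {step: 0.0 for step in range(1, NUM_STEPS + 1)}
    for seg in segments:
        if seg.step not in totals:
            raise ValueError(f"unknown step {seg.step}")
        if seg.end_s < seg.start_s:
            raise ValueError(f"segment for step {seg.step} ends before it starts")
        totals[seg.step] += seg.duration_s
    return totals


# Made-up example values, not taken from the dataset.
example = [StepSegment(1, 0.0, 42.5), StepSegment(2, 42.5, 90.0)]
durations = step_durations(example)
```

A per-step summary like this could be one starting point for checking annotation consistency across videos or for building training labels for action classification, once the actual annotation format is known.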