Most existing fault diagnosis methods are developed and evaluated under steady operation, with little attention to their adaptability to varying working conditions, which limits their generalization. In this study, a novel deep transfer learning architecture is proposed for fault diagnosis under varying working conditions. A modified capsule network is developed by combining the domain adversarial framework with the classical capsule network to simultaneously recognize machinery faults and working conditions. The novelty of the proposed architecture lies mainly in this integration of the domain adversarial mechanism and the capsule network. The domain adversarial mechanism, drawn from transfer learning, can achieve promising performance in cross-condition fault diagnosis tasks. With the proposed architecture, the learned features exhibit identical or very similar distributions in the source and target domains. Hence, a model trained under one working condition can be applied to different conditions without being hindered by the shift between the two domains. The proposed method is applied to vibration signals of a bearing system acquired under different working conditions, i.e., loads and rotational speeds. The experimental results indicate that the proposed method outperforms other state-of-the-art methods in fault diagnosis under varying working conditions.
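
As a rough illustration of the two ingredients named above, the following minimal sketch shows the "squash" nonlinearity commonly used in capsule networks and a gradient reversal step, which is one common way to realize domain adversarial training. The abstract does not specify the implementation, so the function names, the parameter `lam`, and the choice of gradient reversal are assumptions made for illustration and are not the authors' method.

```python
import math


def squash(vector):
    """Capsule 'squash' nonlinearity (illustrative).

    Keeps the direction of the input vector and maps its length
    into [0, 1), so the length can be read as a presence score.
    """
    norm_sq = sum(x * x for x in vector)
    if norm_sq == 0.0:
        # A zero vector has no direction; return it unchanged.
        return [0.0 for _ in vector]
    # ||v||^2 / (1 + ||v||^2) rescales the length; 1 / ||v|| normalizes.
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in vector]


def grad_reverse_forward(features):
    """Forward pass of a gradient reversal step: the identity."""
    return list(features)


def grad_reverse_backward(upstream_grad, lam=1.0):
    """Backward pass of a gradient reversal step (assumed design).

    The gradient coming from the domain (working-condition) classifier
    is negated and scaled by lam, which pushes the feature extractor
    toward features that the domain classifier cannot separate.
    """
    return [-lam * g for g in upstream_grad]


if __name__ == "__main__":
    capsule = squash([3.0, 4.0])
    print("squashed capsule:", capsule)
    print("reversed gradient:", grad_reverse_backward([0.5, -2.0], lam=0.1))
```

In a setup of this kind, the fault classifier would be trained normally on labeled source-domain data, while the reversed gradient from the domain classifier encourages the source and target feature distributions to align.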