In task-oriented dialog systems, slot filling aims to identify the semantic slot type of each token in an utterance. Because sufficient supervised data is lacking in many scenarios, knowledge must be transferred via cross-domain slot filling. Previous studies focus on building relationships among similar slots across domains by providing additional descriptions, yet they do not fully utilize prior information. In this study, we make two main improvements. First, we improve hierarchical frameworks based on pre-trained models; specifically, we add domain descriptions to the auxiliary information in the similarity layer to strengthen cross-domain slot relationships. Second, we replace independent fine-tuning with multi-task learning through an auxiliary network, in which a domain detection task is deliberately set up to correspond to the domain descriptions. Additionally, we adopt adversarial regularization to avoid over-fitting. Experimental results on the SNIPS dataset show that our model significantly outperforms the best baseline in micro F1 by 16.11%, 11.06%, and 8.77% in the 0-shot, 20-shot, and 50-shot settings, respectively, demonstrating better generalization ability, especially for domain-specific slots.