Deep neural networks provide effective solutions to small-footprint keyword spotting (KWS). However, most KWS methods use the softmax cross-entropy loss, which focuses only on maximizing classification accuracy on the training set without accounting for unseen sounds that fall outside the training data. When training data is limited, it remains challenging to achieve robust and highly accurate KWS in real-world scenarios where such unseen sounds are frequently encountered. In this paper, we propose a new KWS method that consists of a novel loss function, namely maximization of the area under the receiver-operating-characteristic curve (AUC), and a confidence-based decision method. The proposed KWS method not only maintains high keyword classification accuracy but is also robust to unseen sounds. Experimental results on the Google Speech Commands datasets v1 and v2 show that our method achieves state-of-the-art performance on most evaluation metrics.
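The abstract does not give the paper's exact loss formulation, but the core idea of AUC maximization is commonly implemented as a pairwise surrogate: AUC equals the probability that a randomly chosen positive (keyword) example scores higher than a randomly chosen negative (non-keyword) example, so one can minimize a hinge penalty over all positive/negative score pairs. The following is a minimal illustrative sketch, not the authors' implementation; the helper name `auc_hinge_loss` and the margin value are assumptions.

```python
import numpy as np

def auc_hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Pairwise hinge surrogate for AUC maximization (illustrative).

    Penalizes every (positive, negative) pair whose score gap is
    smaller than the margin; driving this loss to zero ranks all
    keyword examples above all non-keyword examples, i.e. AUC = 1.
    """
    # Broadcast to form the score difference for every pair.
    diffs = pos_scores[:, None] - neg_scores[None, :]
    return np.maximum(0.0, margin - diffs).mean()

# Toy detection scores: keyword utterances vs. background sounds.
pos = np.array([2.5, 1.8, 3.0])
neg = np.array([0.2, -0.5])
loss = auc_hinge_loss(pos, neg)
```

In a neural KWS model, the same pairwise term would be computed on mini-batch logits and minimized by gradient descent alongside (or instead of) cross-entropy.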