Neural network language models (NNLM) have become an increasingly popular choice for large vocabulary continuous speech recognition (LVCSR) tasks, due to their inherent generalisation and discriminative power. This paper present two techniques to improve performance of standard NNLMs. First, the form of NNLM is modelled by introduction an additional output layer node to model the probability mass of out-of-shortlist (OOS) words. An associated probability normalisation scheme is explicitly derived. Second, a novel NNLM adaptation method using a cascaded network is proposed. Consistent WER reductions were obtained on a state-of-the-art Arabic LVCSR task over conventional NNLMs. Further performance gains were also observed after NNLM adaptation.