In this paper, we introduce three methods to enhance the state-of-the-art ECAPA-TDNN model for speaker verification: a self-calibration (SC) module, a simple attention module (SimAM), and a modified temporal dynamic convolution (MTDY)-based front-end module. The SC module expands the model's receptive field and improves spatial attention, capturing contextual information more effectively. SimAM assigns a unique weight to each individual neuron, placing greater emphasis on the more informative ones. The MTDY-based front-end module adapts to diverse temporal speech characteristics by aggregating multiple convolutional kernels with input-dependent attention weights, capturing temporal variations in the signal. Our proposed model, IM ECAPA MTDY-TDNN SimAM, achieves a better trade-off between performance and complexity than recent works. On the VoxCeleb1-H test set, it achieves 1.655% EER and 0.157 minDCF with 9.71M parameters and 1.97G FLOPs.
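SimAM derives the per-neuron weights from a closed-form energy function rather than from learned parameters, so it adds no trainable weights. Below is a minimal sketch of that mechanism, assuming the standard formulation of Yang et al. (2021) adapted to 1-D (batch, channels, time) speaker features; the class name `SimAM1d` and the `e_lambda` default are illustrative, not taken from the paper:

```python
import torch
import torch.nn as nn

class SimAM1d(nn.Module):
    """Parameter-free neuron-level attention (Yang et al., ICML 2021),
    adapted here to 1-D (batch, channels, time) speaker features."""

    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda  # regularizer in the closed-form energy

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n = x.size(-1) - 1                              # neurons per channel, minus one
        d = (x - x.mean(dim=-1, keepdim=True)).pow(2)   # squared deviation from the channel mean
        v = d.sum(dim=-1, keepdim=True) / n             # channel-wise variance estimate
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5     # inverse of the minimal energy per neuron
        return x * torch.sigmoid(e_inv)                 # lower energy => more distinctive => higher weight

# Usage: weight a (batch=8, channels=512, frames=200) feature map in place.
x = torch.randn(8, 512, 200)
assert SimAM1d()(x).shape == x.shape
```

Neurons whose activations stand out from their channel mean get lower energy and hence larger weights, which is how the module emphasizes informative neurons without any extra parameters.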
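Dynamic convolution of the kind underlying MTDY replaces a single fixed kernel with K candidate kernels that are mixed by input-dependent attention weights before the convolution is applied. The sketch below shows only a simplified, non-temporal variant (one kernel mixture per utterance rather than per frame, so it omits the frame-wise adaptation that distinguishes MTDY); all names and hyperparameters are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv1d(nn.Module):
    """Simplified dynamic convolution: K parallel kernels are aggregated
    with input-dependent attention weights, then applied as one conv1d."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int, num_kernels: int = 4):
        super().__init__()
        # K candidate kernels, each shaped (out_ch, in_ch, kernel_size).
        self.weight = nn.Parameter(
            torch.randn(num_kernels, out_ch, in_ch, kernel_size) * 0.02)
        # Attention branch: global average pool -> linear -> K logits.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(in_ch, num_kernels))
        self.pad = kernel_size // 2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_ch, time)
        b = x.size(0)
        pi = F.softmax(self.attn(x), dim=-1)                 # (b, K) mixing weights
        w = torch.einsum('bk,koit->boit', pi, self.weight)   # per-sample aggregated kernel
        # Grouped-conv trick: fold the batch into channels so each sample
        # is convolved with its own aggregated kernel in a single call.
        out = F.conv1d(x.reshape(1, -1, x.size(-1)),
                       w.reshape(-1, w.size(2), w.size(3)),
                       padding=self.pad, groups=b)
        return out.reshape(b, -1, x.size(-1))

# Usage: map 80 mel-filterbank channels to 512 with a 5-tap mixed kernel.
feats = torch.randn(8, 80, 200)
assert DynamicConv1d(80, 512, 5)(feats).shape == (8, 512, 200)
```

The attention branch makes the effective kernel a function of the input, which is what lets such a front end adapt to diverse speech characteristics at modest extra cost: the K kernels are aggregated once, so the convolution itself runs no more than a standard one.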