Neural speech codecs are a crucial component of generative tasks such as speech resynthesis and zero-shot TTS. However, most existing codecs exhibit degraded performance at low token rates because they model complex, entangled speech information inefficiently. In this paper, we propose FreeCodec, a self-supervised disentangled neural speech codec. It employs distinct frame-level encoders to decompose intrinsic speech properties into separate components and adopts enhanced decoders to reconstruct the speech signal. By encoding and quantizing each type of frame-level information with a dedicated quantizer, FreeCodec achieves higher coding efficiency with 57 tokens. Furthermore, our method can be applied flexibly to both reconstruction and disentanglement scenarios under different training strategies. Subjective and objective experimental results demonstrate that our framework outperforms existing methods in both reconstruction and disentanglement tasks.