Named entity recognition is a critical task in the natural language processing field. Most existing methods for this task can only exploit contextual information within a sentence. However, their performance on recognizing entities in limited or ambiguous sentence-level contexts is usually unsatisfactory. Fortunately, other sentences in the same document can provide supplementary document-level contexts to help recognize these entities. In addition, words themselves contain word-level contextual information since they usually have different preferences of entity type and relative position from named entities. In this paper, we propose a unified framework to incorporate multi-level contexts for named entity recognition. We use TagLM as our basic model to capture sentence-level contexts. To incorporate document-level contexts, we propose to capture interactions between sentences via a multi-head self attention network. To mine word-level contexts, we propose an auxiliary task to predict the type of each word to capture its type preference. We jointly train our model in entity recognition and the auxiliary classification task via multi-task learning. The experimental results on several benchmark datasets validate the effectiveness of our method.