Kad is one of the most popular peer-to-peer (P2P) networks deployed on today’s Internet. It supports file-sharing applications such as eMule and aMule and serves millions of users. Its reliability affects not only the availability of file-sharing services but also its capability to support other Internet services. However, in today’s Kad network, the success ratio of lookup operations is below 91 %, which makes it unsuitable for critical applications. In this paper, we investigate why Kad lookups fail and propose several new solutions. We build a measurement system called Anthill to analyze Kad’s communication process quantitatively, and find that the causes of Kad’s lookup failures fall into four categories: packet loss, selective Denial of Service nodes, search sequence miss, and publish/search space miss. The first two are due to changes in the environment, the third is caused by the detachment of routing operations from content operations in Kad, and the last reflects the limitations of the Kademlia DHT algorithm under Kad’s current configuration. Based on this analysis, we propose corresponding approaches for Kad, including packet retransmission, neighborhood lookup, and β-adjusting. We systematically measure the effectiveness and efficiency of these approaches and give recommendations for their adoption in different situations. The improved version of Kad achieves a lookup success ratio of 99.8 % with only moderate communication overhead, while its average lookup latency is reduced significantly, to about 1 second. Our work shows that, with proper configuration and improvements, Kad can work much better and is capable of supporting more Internet services.