Accurately judging the relevance of web pages to a user's information need is an important research direction in information retrieval. This paper proposes a semantic information retrieval algorithm based on web page segmentation. Because web pages are semi-structured, the key idea is to divide each page into topical areas, or segments, according to its HTML tags and content. The algorithm first builds an HTML tag tree and then merges nodes in the tree using both content similarity and visual similarity. The retrieval and ranking algorithm uses this segmentation information to search for and order the relevant pages. Experimental results show that the method significantly improves search precision.
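The pipeline summarized above (tag tree, node merging, segment-aware ranking) can be illustrated with a minimal sketch. It is not the paper's implementation, and every name and constant in it is an assumption made for illustration: `BLOCK_TAGS`, `segment`, `rank`, the 0.7/0.3 weights, and the 0.4 threshold are hypothetical. Content similarity is approximated by bag-of-words cosine similarity, visual similarity is crudely approximated by "same tag name", and the input is assumed to be well-formed HTML.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical set of tags treated as candidate segment boundaries.
BLOCK_TAGS = {"div", "p", "td", "li", "section", "article", "h1", "h2", "h3"}


class BlockCollector(HTMLParser):
    """Collect [tag, text] for block-level elements, in document order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # indices into self.blocks for currently open blocks
        self.blocks = []  # one [tag, text] entry per block-level element

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.blocks.append([tag, ""])
            self.stack.append(len(self.blocks) - 1)

    def handle_endtag(self, tag):
        # Assumes well-formed nesting; mismatched end tags are ignored.
        if self.stack and self.blocks[self.stack[-1]][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        # Attach text to the innermost open block.
        if self.stack and data.strip():
            self.blocks[self.stack[-1]][1] += " " + data.strip()


def tokens(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(html, threshold=0.4):
    """Merge adjacent blocks whose combined similarity reaches the threshold."""
    parser = BlockCollector()
    parser.feed(html)
    segments = []
    for tag, text in parser.blocks:
        text = text.strip()
        if not text:
            continue  # skip blocks with no direct text of their own
        if segments:
            last = segments[-1]
            content_sim = cosine(tokens(last["text"]), tokens(text))
            visual_sim = 1.0 if last["tag"] == tag else 0.0  # crude stand-in
            if 0.7 * content_sim + 0.3 * visual_sim >= threshold:
                last["text"] += " " + text
                continue
        segments.append({"tag": tag, "text": text})
    return segments


def rank(pages, query):
    """Order pages by the score of their best-matching segment."""
    q = tokens(query)
    scored = []
    for name, html in pages.items():
        best = max((cosine(q, tokens(s["text"])) for s in segment(html)), default=0.0)
        scored.append((name, best))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Scoring a page by its best segment rather than by its full text is one plausible way to use segmentation in ranking: a page that covers several unrelated topics is not penalized for the segments that do not match the query.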