Graph Based Search Engine

Search engines have become a major tool for finding information on the World Wide Web (WWW). When searching the huge digital library available on the WWW, every effort is made to retrieve the most relevant results. However, the majority of Web pages are in HTML format, and HTML has no tags that tell a crawler which specific domain a page belongs to. To find more relevant results, we use an ontology for the particular domain; when working with multiple domains, we use multiple ontologies. To design a domain-specific search engine for multiple domains, the crawler must therefore crawl through the domain-specific Web pages in the WWW according to the predefined ontologies.

The Internet is a vast reservoir of information. It has brought the concept of "Vasudhaiva Kutumbakam" ("the world is one family") to reality. To find information on the Internet, we need a document retrieval system called a search engine. A Web search engine mainly searches for documents in the WWW. A Web crawler is a program that crawls through the WWW and returns the Web pages it encounters to the search engine. After collecting a predefined number of Web pages, the crawler stops running. The search engine allows one to ask for content meeting specific criteria (typically pages containing a given word or phrase) and searches for that word or phrase in the Web pages returned by the crawler. It then retrieves the items that match those criteria and produces a ranked list of the URLs in which the keywords matched. Although such technologies are widely used, users still often face the daunting task of sifting through multiple pages of results, many of which are irrelevant.
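The conventional pipeline described above (crawl until a predefined number of pages is collected, then match a keyword and rank the URLs) can be illustrated with a minimal sketch. It makes several assumptions that are not part of this paper: the Web is replaced by a small in-memory dictionary named TOY_WEB, the crawl order is breadth-first, and ranking is by raw keyword count. The function names crawl and search are hypothetical.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
# A real crawler would fetch and parse HTML instead.
TOY_WEB = {
    "http://a.example": ("cricket score cricket news",
                         ["http://b.example", "http://c.example"]),
    "http://b.example": ("football news and cricket", ["http://c.example"]),
    "http://c.example": ("weather forecast",
                         ["http://a.example", "http://d.example"]),
    "http://d.example": ("cricket cricket cricket", []),
}


def crawl(seed, web, max_pages):
    """Breadth-first crawl that stops once max_pages pages are collected."""
    pages = {}
    queue = deque([seed])
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in pages or url not in web:
            continue  # skip pages already visited or unknown URLs
        text, links = web[url]
        pages[url] = text
        queue.extend(links)
    return pages


def search(pages, keyword):
    """Return URLs whose text contains keyword, ranked by occurrence count."""
    keyword = keyword.lower()
    scored = []
    for url, text in pages.items():
        count = text.lower().split().count(keyword)
        if count > 0:
            scored.append((count, url))
    # Highest count first; ties are broken alphabetically by URL.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored]


if __name__ == "__main__":
    collected = crawl("http://a.example", TOY_WEB, max_pages=3)
    print(search(collected, "cricket"))
```

Because the crawl stops at the page limit, a highly relevant page such as the hypothetical http://d.example is never seen when max_pages is 3, which illustrates why a keyword-only pipeline can return incomplete or poorly targeted results.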

In this paper, we discuss the basic idea of graph-based searching and describe a design and development methodology for a multiple-domain-specific search engine based on multiple ontology matching and relevance limits, which not only overcomes the problem of knowledge overhead but also supports conventional queries. Further, it is able to produce exact answers from the graph that satisfy user queries.
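As a rough illustration of what matching a page against multiple ontologies with relevance limits could look like, the sketch below scores a page against each domain and keeps only the domains whose score reaches a threshold. This is not the method developed in this paper: the ontologies are reduced to hypothetical weighted term lists, and the weights, the limits, and the function names relevance_scores and matching_domains are all assumptions made for the example.

```python
# Hypothetical ontologies, reduced to weighted term lists: domain -> {term: weight}.
ONTOLOGIES = {
    "cricket": {"cricket": 3, "wicket": 2, "batsman": 2, "umpire": 1},
    "football": {"football": 3, "goal": 2, "striker": 2, "referee": 1},
}

# Assumed relevance limit per domain: a page is kept for a domain only
# if its score for that domain is at least this value.
RELEVANCE_LIMITS = {"cricket": 5, "football": 5}


def relevance_scores(text, ontologies):
    """Score a page against every ontology by summing matched term weights."""
    words = text.lower().split()
    scores = {}
    for domain, terms in ontologies.items():
        scores[domain] = sum(
            weight * words.count(term) for term, weight in terms.items()
        )
    return scores


def matching_domains(text, ontologies, limits):
    """Return the sorted list of domains whose score meets its relevance limit."""
    scores = relevance_scores(text, ontologies)
    return sorted(d for d, s in scores.items() if s >= limits[d])


if __name__ == "__main__":
    page = "the batsman lost his wicket in the cricket match"
    print(relevance_scores(page, ONTOLOGIES))
    print(matching_domains(page, ONTOLOGIES, RELEVANCE_LIMITS))
```

A crawler built along these lines would discard pages for which matching_domains returns an empty list, so that only domain-specific pages are retained.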