Searcharoo.net ASP.NET Search Engine

Searcharoo.net: ASP.NET Search with C#

Version 7
Display document excerpt (with keywords highlighted) on the results page, plus bugfixes. NEW! Mar '09
Version 6
Index JPG images, index GPS location data for mapping results, address "No" Trust problem and fix a few bugs.
Version 5
Remove Binary Serialization to solve Medium Trust problem; index OpenXML document formats.
Version 4
Refactored codebase, plus the ability to index and search Microsoft Word, Excel, PowerPoint and Acrobat PDF files. Small improvements such as robots.txt support and excluding regions of HTML were also added.
Version 3
Adds a "save to disk" for the catalog; feature suggestions, bug fixes and incorporation of code contributed by others from previous versions.
Version 2
Extend Searcharoo to populate its search catalog by spidering HTML pages: follow links and imagemaps to process both static and dynamically generated pages! You can also search for multiple words.
Version 1
How to build a simple, extensible search engine using ASP.NET that can crawl files and create a searchable catalog by processing the text from HTML source.

Searcharoo.net is an open-source C#/ASP.NET implementation of a search engine that you can download and use on your website. Pick the most recent version from the menu and look for a download link.

The default interface should be familiar (and is easily customizable in ASPX/HTML, jQuery/AJAX or Silverlight 2.0)!

The results can show not only the text but also geo-location information (and URLs that open in Google Earth):
Overview of version 6: Image search, Google Earth and Google Maps
The articles describe how the engine itself is built, from a simple file-system crawler to a fully-fledged web-spider. You can comment or ask questions on CodeProject.

In addition to information on this website, these search-related links might be interesting/useful.

Web search technology is a huge subject, encompassing:

  • networking (spidering the web),
  • string and markup-language manipulation (parsing HTML),
  • proprietary file formats (searching Word, Excel, PDF, etc.),
  • language and text-parsing (finding words & sentences in documents, stemming and other linguistic analysis),
  • algorithms (finding matches, AND/OR queries, combining multiple word results),
  • performance (both increasing spidering speed, and making large catalogs fast to search),
  • user interface (presenting search input options, and results),

and I would encourage you to read as much as you can about these subjects and modify Searcharoo for your own specific purpose.
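
As a rough illustration of the "algorithms" item in the list above, here is a minimal sketch of a word catalog with AND/OR matching. It is written in Python for brevity and is not Searcharoo's C# code; the names `tokenize`, `build_catalog` and `search`, and the sample documents, are all hypothetical and chosen only for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_catalog(documents):
    """Map each word to the set of document names that contain it."""
    catalog = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            catalog[word].add(name)
    return catalog


def search(catalog, query, mode="AND"):
    """Combine per-word result sets: intersection for AND, union for OR."""
    result_sets = [catalog.get(word, set()) for word in tokenize(query)]
    if not result_sets:
        return set()  # an empty query matches nothing
    if mode == "AND":
        return set.intersection(*result_sets)
    return set.union(*result_sets)


# Hypothetical sample documents (assumed for illustration only).
documents = {
    "a.html": "Spidering the web with ASP.NET",
    "b.html": "Parsing HTML and indexing words",
    "c.html": "Indexing the web: words and catalogs",
}
catalog = build_catalog(documents)
```

With these sample documents, `search(catalog, "web indexing")` matches only `c.html`, while `search(catalog, "web indexing", mode="OR")` matches all three pages. A real engine would also rank the matches and might apply stemming before the lookup.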

New in version 7.0

Search in Silverlight

Search using jQuery and JSON

Useful links

conceptdevelopment.net

Craig's Blog
 Linqaroo - Linq for Searcharoo

On Search, the Series

dotLucene [Open Source]

SiteSearchEngine [VB.net article]

What is Stemming?

Robots.txt
