What the Google Mini will spider

January 10, 2006 in Google Mini, Spidering

Google have a helpful list of the file formats the Google Mini will spider.

It’s worth checking what you want to spider before you consider the other factors in why you are buying a search appliance. Checking through the various file formats I have, I was surprised to see the Mini supports .wps files written by Microsoft Works for DOS. It’s not a difficult format to read, but it is an old one now: Works v2 is copyright 1988, if my memory serves me correctly. Personally, I have a ton of old Works files and it’s nice to know something will still understand them. I told Google Desktop they were txt files with an odd extension, but that can give dubious results because part of each file is binary.
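
As a rough illustration of why the "pretend it's a txt file" trick gives mixed results, here is a minimal Python sketch that keeps only the runs of printable characters in a partly binary file and throws the rest away. The function name `printable_runs` and the sample bytes are made up for this example; they do not reflect the real .wps layout, and this is not a description of how the Mini or Google Desktop read these files.

```python
import re

def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII characters found in data."""
    # Printable ASCII is space (0x20) through tilde (0x7e); tabs are kept as well.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]

# Hypothetical bytes: readable text mixed with binary junk (NOT a real .wps file).
sample = b"\x01\xfe\x00Dear Sir,\x00\x02\x1fQuarterly report\x00\xff\x03ab\x00"

for run in printable_runs(sample):
    print(run)
```

Anything treated as plain text would also see the bytes between those runs, which is presumably where the dubious results come from. Short fragments such as the two-character "ab" above are dropped by the `min_len` threshold, so a filter like this can lose real text as well as noise.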

Comments (2)

  1. Comment by C.H. Van — January 21, 2006 @ 6:52 am

    It’s my understanding that Google licenses a third-party filter for its appliance (as almost all search indexing companies do). I don’t know which one Google licenses, but both Verity Keyview and Stellent OutsideIn support this format, along with lots of others.

  2. Pingback by GSA Developer » Blog Archive » You can’t spider XML with a Google Mini (so far) — January 16, 2007 @ 2:06 pm

    […] A question I’ve seen come up a lot which isn’t answered directly by my earlier post is whether the Google Mini or Search Appliance can spider raw XML. Unfortunately, no, it cannot. […]
