QuickFinder
(webpage indexing and searching)
3/6/2006
QuickFinder is a quick and easy way to index all of your webpages; it provides a search engine and returns search results. NetWare 6.5 includes QuickFinder. Read the QuickFinder Server manual here.
A list of QuickFinder startup switches can be found in TID 10063970.
One problem I quickly ran into was that I had shut off port 80 in the
Apache conf file, which also blocked any use of QF, since by default
it is served on port 80. It is also not obvious how to change that without
editing the Apache conf file again. Apache loads four external conf files
for QuickFinder. These exist in
the SYS:/qfsearch/WEB-INF directory and include the following
files:
The file that provides the search engine configuration is QFSrchApache.conf.
If you open it and look, you will see a tag that reads
<Directory "SYS:/qfsearch/docs">, which makes the main QF
docs directory available to Apache, and then an alias directive line:
Alias /quickfinder "SYS:/qfsearch/docs"
The easiest way to allow QF access without allowing port 80 is to set
up Apache to listen on another port, then set up a virtual directory
for QF (the virtual directory is required because you can't specify
a port in a <Directory> tag). Port 81 is already a redirect to
https port 8009 (NoRM), so I decided to use port 82. I added the following
lines to QFSrchApache.conf:
After restarting Apache/Tomcat, I was able to access the QuickFinder search page, which is normally http://ip.address/quickfinder, by going to http://ip.address:82.
This does not work as intended, though: it opens up all of the directories that would normally be served on port 80 so that they are now served on port 82 as well.
Now it's time to set up a search. A virtual server is set up by default, containing two indexes, "QuickFinder Server" and "DocRoot." However, I wanted to set up a new one. Go into the QF Search manager at the URL I mentioned above. If you only see the "Indexing" and "Settings" sections in the left-hand menu, it's because you are already in the settings for a virtual server. I'm embarrassed to say that it took me a while to figure this out. You must click on the Home icon at the top, which brings you to a page with the "Virtual Search Servers," "Default Settings," and "Service Settings" sections in the left-hand menu. Click on Virtual Search Servers -> Add. I entered the following
settings for mine:
They tell you to use a DNS name or IP address for the name of the virtual server, but the site that I want searched is in a subdirectory (www.datastat.com/sysadminjournal), and I did not know how it would react to a slash in the name, so I used the relative name. Click Add. It will take a minute to create the directory, but if you look in the filesystem at the specified location (SYS:/qfsearch/Sites/sysadminjournal, since I used the default), you will see that it created the directory and put a single file in it, named ReportTemplate.html.
Most of the default settings are OK, but one that should definitely be noted is "URLs are case sensitive," under Default Settings -> Index. The default is No, but I changed this to Yes, since 99% of URLs are case sensitive (including ALL of mine). Note that I also kept "Maximum depth of off-site URLs" set at 0, even though the page calls this "Unlimited," because I do not want any off-site URLs indexed. The label is misleading: when you go to the configuration of an actual index, you will see that 0 actually tells it not to do any off-site indexing. I also set "Maximum time to download a URL" to 10 seconds.
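
To see why the "URLs are case sensitive" setting matters, here is a toy illustration in Python. It is not QuickFinder's actual logic, only a sketch of what any case-insensitive URL comparison would do; the function name `dedupe` and the two page names are made up for the example.

```python
def dedupe(urls, case_sensitive):
    """Return urls with duplicates removed, keeping the first occurrence."""
    seen = set()
    kept = []
    for url in urls:
        # With case-insensitive matching, URLs that differ only in
        # letter case collapse into the same key.
        key = url if case_sensitive else url.lower()
        if key not in seen:
            seen.add(key)
            kept.append(url)
    return kept


# Hypothetical page names under the site used in this post.
crawled = [
    "http://www.datastat.com/sysadminjournal/QuickFinder.html",
    "http://www.datastat.com/sysadminjournal/quickfinder.html",
]

print(len(dedupe(crawled, case_sensitive=True)))   # both pages kept
print(len(dedupe(crawled, case_sensitive=False)))  # treated as one page
```

On a web server where those two paths are really different pages, the case-insensitive comparison would lose one of them, which is why I would rather have the setting at Yes.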
Now, you must add at least one index to
your site (your Virtual Server, that is). Click on Virtual
Search Servers -> List, then click the "Manage"
button next to the site you just created. It should then say "No
indexes are currently defined." In the "Define a New Index"
box, select "New crawled index" and then click the
"Define Index" button. Before entering any values,
click on "Advanced Settings." At the next screen, I
entered the following values:
If you wanted to add more websites to crawl and be included in the same index, you could add them here as well. Click on Apply Settings. If everything is good, it will say "Index ... was successfully defined!" If you go into the site's directory, you will now see that it added the following files: General.properties, qfind.cfg, qfind.cfg.bk1, and a logs directory.
You must now schedule an index maintenance event so that your index is populated on a regular basis. Click on Indexing -> Scheduling. There should be no events scheduled yet. Click on "Add Event." The default is to regenerate all collections every morning at 2:00am. I changed the time to 5:00am, but kept all other settings the same. Update does an incremental index, while Optimize merges those incremental indexes in. Click on Apply Settings. A list of events that includes the one you just created should now appear.
Unless you want to wait a while for your site to be indexed, you will want to run it manually. Click on Indexing -> Management from the left-hand menu, where you will see a list of your indexes for this Virtual Server. If you just created one, it should say "Index type: Crawled" but also "Status: Not yet indexed." Click on the Generate button next to that index. It should now tell you that it is indexing. My site is small, so it finished (and refreshed the page) in just a couple of seconds. Voila!
Let's try a search using that new index. From the "Index Management" page, click on "Display Search Page" at the bottom of the left-hand menu. This link did not work for me, since I had to redefine the port. Once I added the port to the URL, though, it worked fine. The URL for the search page using the Virtual Server created above is http://s3.datastat.com:82/qfsearch/SearchServlet?site=sysadminjournal. Give it a shot. If it works, you're done!
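
Since the "Display Search Page" link leaves out a non-default port, it can be handy to build the search URL yourself. The following is a minimal Python sketch that only reproduces the URL pattern shown above (host, optional port, /qfsearch/SearchServlet, and the site parameter). The function name `build_search_url` and the `extra_params` argument are my own inventions for illustration; any additional query parameters you pass in are assumptions to check against the QuickFinder manual.

```python
from urllib.parse import urlencode


def build_search_url(host, site, port=80, extra_params=None):
    """Build a SearchServlet URL for one QuickFinder virtual search server."""
    params = {"site": site}
    if extra_params:
        # Hypothetical hook: extra query parameters are passed through as-is.
        params.update(extra_params)
    # Leave the port out of the URL when it is the HTTP default.
    netloc = host if port == 80 else f"{host}:{port}"
    return f"http://{netloc}/qfsearch/SearchServlet?{urlencode(params)}"


print(build_search_url("s3.datastat.com", "sysadminjournal", port=82))
```

With the values from this post it prints the same URL as above, and leaving out `port=82` gives the form without a port.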
To test QuickFinder using the default search page, enter http://ip.address/qfsearch/SearchServlet? into your browser. Because of the port that I added in Apache above, I can actually go to http://ip.address:82/SearchServlet? as well. http://svr1.your-domain-name.com/oneNet/NetStorage will also work, even though it's insecure. Good for testing when turning off port 80!