Tuesday, September 30, 2008

Making Sensible Queries

Google Web Services can help you perform a number of tasks, but each request and response consumes resources. To get the most from this Web service, you need to optimize the requests and responses so that the value of the information you receive exceeds the cost of transmitting and manipulating the data.

Creating a request and then handling the response has several costs associated with it. Some of the costs are tangible in that you must provide the infrastructure required to perform the task. Inefficient queries could mean adding bandwidth capacity or servers (if you make enough queries). Some costs are employee related: inefficient queries mean more waiting time as the computer crunches the data. Finally, inefficient queries can incur intangible costs. For example, people can become frustrated with poor query results, which affects their performance. Some of these costs are impossible to measure accurately, but they're real.

I often rely on the online search engine to help tune queries. The Advanced Search (http://www.google.com/advanced_search/) page shown in Figure 1.1 can help you define and tune searches to obtain maximum data with minimal resource use. For example, computer technology is quickly outdated, so I normally provide a date range as part of my search. Trying out date ranges for specific keywords in the online search first shows you which range filters out stale results, so the request you eventually make from code returns less data that you have to discard.
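As a rough illustration of adding a date range to a query from code, here is a minimal Python sketch. It assumes the `daterange:` search operator, which Google's search syntax accepted at the time with Julian day numbers as its start and end values. The helper names `julian_day` and `with_date_range` are hypothetical, and you should confirm the operator's behavior on the online search page before relying on it.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal and the Julian day number.
JULIAN_DAY_OFFSET = 1721425


def julian_day(d: date) -> int:
    """Return the Julian day number for a calendar date."""
    return d.toordinal() + JULIAN_DAY_OFFSET


def with_date_range(keywords: str, start: date, end: date) -> str:
    """Append a daterange: restriction (assumed operator) to a keyword query."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return f"{keywords} daterange:{julian_day(start)}-{julian_day(end)}"


# Restrict a search for "web services" to the first nine months of 2008.
query = with_date_range("web services", date(2008, 1, 1), date(2008, 9, 30))
print(query)
```

Keeping the date conversion in one small function makes it easy to paste the generated query into the online search page and check the results before spending a Web service request on it.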

The Advanced Search page can help you tune keyword order, which makes it possible to reduce the number of keyword permutations you use in an expansion search (see the "Conducting an Expansion Search" section of the chapter for details). It also helps you decide how to use permanent keywords. For example, a site that sells a specific product might include the product name as a permanent search term, one that is always included even if the user doesn't specify it.
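The following Python sketch shows one way a permanent keyword and keyword permutations might fit together. It is only an illustration: the function name `build_queries` and the product name "WidgetPro" are made up for the example, and the expansion search described in the chapter may work differently.

```python
from itertools import permutations


def build_queries(user_keywords, permanent_term):
    """Yield one query per keyword ordering, each led by the permanent term."""
    # Drop the permanent term if the user typed it, so it is not repeated.
    terms = [k for k in user_keywords if k.lower() != permanent_term.lower()]
    for ordering in permutations(terms):
        yield " ".join((permanent_term, *ordering))


# "WidgetPro" is a hypothetical product name used as the permanent keyword.
queries = list(build_queries(["manual", "repair"], "WidgetPro"))
for q in queries:
    print(q)
```

Because n keywords produce n! orderings, every ordering you can rule out by hand on the Advanced Search page is a request you don't have to make from code.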

One of the features the kit provides is a more complete list of special phrases and characters you can use for a search. Although the Advanced Search page will help you ferret out many of these search features, you won't find them all. For example, the Advanced Search page includes a blank for a search phrase, which is different from a keyword in that the search phrase must appear as specified on the target page (keywords can appear in any order). Fortunately, all of the special keywords and characters mentioned in the kit also work on the Advanced Search page, so you can try them out. (Chapter 2 discusses search techniques in detail.) Depending on which special features you use for a search, Google Web Services output might not provide the information you expected. Consequently, it pays to try these special features out to see what effect they have on your search results.
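To make the difference between a search phrase and keywords concrete, here is a small local Python sketch. It does not reproduce Google's matching. It only models the rule stated above on a piece of sample text, and the names `tokenize`, `matches_keywords`, and `matches_phrase` are hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_keywords(text, keywords):
    """True when every keyword appears somewhere in the text, in any order."""
    words = set(tokenize(text))
    return all(k.lower() in words for k in keywords)


def matches_phrase(text, phrase):
    """True when the words of the phrase appear consecutively and in order."""
    words = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


page = "Services for the web are everywhere."
print(matches_keywords(page, ["web", "services"]))  # keywords in any order
print(matches_phrase(page, "web services"))         # exact phrase required
```

The sample page satisfies the keyword test but not the phrase test, which is the kind of difference worth checking on the Advanced Search page before you choose one form for your code.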

Tip: You might wonder why I'm suggesting such heavy use of the Advanced Search page. Google allows you to make 1,000 requests per day using Google Web Services. The Advanced Search page doesn't have such a limit, which makes it easier to keep testing search techniques until you find the technique you want to use in your code. At that point, you can start making requests from Google Web Services to test your code. Don't waste calls on experimenting with search techniques.
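One way to avoid running into the 1,000-request limit by accident is to count requests on your side before sending them. The Python sketch below is a minimal example of that idea. The `DailyQuota` class is hypothetical, and it assumes the count resets when the local calendar date changes, which may not match how Google tracks the limit.

```python
from datetime import date
from typing import Callable


class DailyQuota:
    """Count requests against a per-day limit, resetting when the date changes."""

    def __init__(self, limit: int = 1000, today: Callable[[], date] = date.today):
        self.limit = limit
        self._today = today      # Injectable clock, which simplifies testing.
        self._day = today()
        self._used = 0

    def _roll_over(self) -> None:
        # Assumption: the quota resets at the start of each local calendar day.
        current = self._today()
        if current != self._day:
            self._day = current
            self._used = 0

    @property
    def remaining(self) -> int:
        self._roll_over()
        return self.limit - self._used

    def try_consume(self) -> bool:
        """Reserve one request; return False when the daily limit is used up."""
        self._roll_over()
        if self._used >= self.limit:
            return False
        self._used += 1
        return True


quota = DailyQuota()
if quota.try_consume():
    pass  # Safe to send one request to the Web service here.
```

Checking `try_consume()` before every call keeps test runs from silently exhausting the day's allowance. The counter lives in memory only, so a real program would need to persist it between runs.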

Even when you create a perfect search and properly filter the results using code, the information you receive from Google Web Services might not fulfill every need. At some point, you need to perform some level of human filtering. Users will need to state a preference or define how well a particular search result works. Only by tuning the filter can you hope to obtain specific results from Google Web Services. Tuning makes it possible to reduce search times from hours to minutes.
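As a hedged sketch of what human filtering could look like in code, the Python example below keeps or drops result URLs based on ratings a user has given to sites. Everything here is an assumption made for illustration: the 1-to-5 rating scale, the per-host granularity, and the function name `filter_by_feedback` do not come from Google Web Services.

```python
from urllib.parse import urlparse


def filter_by_feedback(result_urls, ratings, min_rating=3):
    """Keep results whose host the user has not rated below min_rating.

    ratings maps a host name to a user score (assumed scale: 1 to 5).
    Hosts the user has never rated are kept.
    """
    kept = []
    for url in result_urls:
        host = urlparse(url).netloc.lower()
        if ratings.get(host, min_rating) >= min_rating:
            kept.append(url)
    return kept


results = [
    "http://example.com/a",
    "http://example.org/b",
    "http://example.net/c",
]
user_ratings = {"example.com": 5, "example.org": 1}
print(filter_by_feedback(results, user_ratings))
```

Raising or lowering `min_rating` is the tuning step: a stricter threshold returns fewer results, each from a source the user has already judged useful.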

