A. Basic FAQ
- What is a Google Search Appliance?
- How is the appliance different from the public Google search?
- Is every web page published at UQ indexed in the search appliance?
- How do I add a page to the Google index? How can I remove pages?
- How many documents are in the UQ GSA license?
- Will the appliance crawler follow the URLs found within non-html documents?
- What file formats can the appliance index?
- How are ranking and relevancy determined?
- Can the GSA index documents in Microsoft SharePoint?
- Where can I get more information on the Google Search Appliance?
1. What is a Google Search Appliance?
The Google Search Appliance (GSA) is a stand-alone server that continuously crawls UQ websites and indexes almost every type of document it finds. The appliance is a custom-built server that runs the Google search software and is administered locally. Users searching the UQ websites get customised, relevant results from the GSA.
2. How is the appliance different from the public Google search?
The GSA crawls continuously, while the free syndicated search crawls just once a month. Only the UQ domain (*.uq.edu.au) and some UQ-related websites are included in the search index. We can customise the results to make them more relevant than those provided by the public search. Most of the features and syntax from Google.com are incorporated into the search appliance.
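To illustrate what a host pattern such as *.uq.edu.au covers, here is a minimal Python sketch. It is not the appliance's own configuration or matching logic; the function name and the pattern list are hypothetical, and the other UQ-related sites mentioned above are not listed because this FAQ does not name them.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hypothetical pattern list for illustration; only *.uq.edu.au is named in this FAQ.
INCLUDED_HOST_PATTERNS = ["*.uq.edu.au"]


def in_search_scope(url, patterns=INCLUDED_HOST_PATTERNS):
    """Return True if the URL's host name matches one of the wildcard patterns."""
    # urlparse lowercases the host name; fall back to "" when the URL has no host.
    host = urlparse(url).hostname or ""
    return any(fnmatchcase(host, pattern) for pattern in patterns)


if __name__ == "__main__":
    for url in ("https://www.uq.edu.au/about", "https://www.google.com/"):
        print(url, in_search_scope(url))
```

Note that a pattern of this shape matches subdomains such as www.uq.edu.au but not the bare host uq.edu.au, which would need its own entry in the list.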
3. Is every web page published at UQ indexed in the search appliance?
No. There are a number of web pages which are not indexed on the appliance. For example: pages that create "black holes" (calendars and forum links that go on forever); pages that violate our license agreement; pages that are not public-facing; and others.
4. How do I add a page to the Google index? How can I remove pages?
There is no need to submit your site for indexing. The GSA continuously crawls UQ websites, so pages are added to the search index automatically. The crawler can find any content that is linked from a page already in the index. We generally only remove content from the search index that poses a security risk, is factually incorrect, violates copyright or other laws, or is a subset of content not marked for indexing. On rare occasions, the GSA will not be able to find your content on its own. If your content isn't appearing in search results, please let us know.
5. How many documents are in the UQ GSA license?
The UQ appliance server can be licensed for up to 3 million documents. UQ has licensed the search appliance for 1 million documents.
6. Will the appliance crawler follow the URLs found within non-html documents?
Yes. The GSA will follow links contained in XML, PDF and Flash documents, but not in Microsoft Office documents.
7. What file formats can the appliance index?
The appliance supports over 220 file formats, including HTML, PDF, text, Microsoft Office and many more.
8. How are ranking and relevancy determined?
The Google Search Appliance determines the ranking and relevancy of search results from more than 100 factors, using Hypertext-Matching Analysis. It also uses PageRank technology, which looks at how the pages within the sites link to one another.
9. Can the GSA index documents in Microsoft SharePoint?
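As a rough illustration of the link-analysis idea behind PageRank, here is a minimal Python sketch of the textbook power-iteration formulation. It is not the appliance's actual ranking code, which combines many other factors and is not documented in this FAQ. The function name, the example link graph and the damping value of 0.85 (the conventional textbook choice) are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; the ranks sum to 1.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # Ignore links that point outside the known set of pages.
            targets = [t for t in outlinks if t in links]
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-page site: both subpages link back to the home page.
    site = {"home": ["about", "contact"], "about": ["home"], "contact": ["home"]}
    for page, score in sorted(pagerank(site).items(), key=lambda item: -item[1]):
        print(page, round(score, 3))
```

In this small example the home page ends up with the highest rank because both other pages link to it, which is the intuition behind using link relationships as a relevancy signal.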
Yes. The Google Search Appliance can index content in Microsoft SharePoint.
10. Where can I get more information on the Google Search Appliance?
Check out the Google Search Appliance FAQ here.


