Microsoft Enterprise Search using SharePoint
Slideshow Transcript
- Slide 1: Microsoft Office SharePoint
Server 2007
Search Workshop
游家德 Jade Yu
敦群數位科技股份有限公司
- Slide 2: Microsoft Office SharePoint
Server 2007 Enterprise Search
Enterprise Search Advanced Training –
Building and Implementing Enterprise
Search Solutions
- Slide 3: Workshop Agenda
Day 1 – Search Overview
Microsoft Search Landscape
MOSS 2007 Walkthrough
Architecture and Deployment
Scenarios
Crawl and Query Processes
Search Object Model
Day 2 – Customization and Management
Search Object Model
Business Data Catalog (BDC) Search
Extensibility and Integration
Administration
Capacity Planning
- Slide 4: Assumptions
Some knowledge and experience with Search
functionality
Knowledge of the Business Data Catalog in
general (new in Office 2007 System)
Office 2007 System Content
Creation/Contribution experience
Knowledge of Web site creation and
management in general
Knowledge of MS platform (Windows 2003
Server, ADS, IIS, SQL 2005 & Office Clients)
Knowledge of ASP.NET 2.0 and XSLT
- Slide 5: Workshop Objectives
Explain how to use the Office 2007 Search
functionality
Interpret the Office 2007 System Search
Terminology
Describe the rich feature set of Office 2007
System Search - Servers and Clients
Describe how to use the platform well enough
to use its APIs to extend the products
Explain how Office 2007 System Search will
solve enterprise business requirements
- Slide 6: Module 1
Enterprise Search Overview
- Slide 7: Module Agenda
Microsoft Enterprise Search
Client-side Search Platform
Client-side Comparison
Server-side Search Platform
Key Differences between WSS and MOSS
MOSS 2007 for Search Key Features
MOSS 2007 for Search and MOSS 2007
Comparison
- Slide 8: Microsoft Enterprise Search
Client-Side Search Platform: documents, programs, and media; e-mail messages, appointments, and instant messaging
Server-Side Search Platform: line-of-business systems and structured data sources; unstructured information; people, expertise; external Web sites
- Slide 9: Client-Side Search Platform
Windows Desktop Search (WDS) for
XP and Windows Server
You must install an additional program for
Search
Vista – Integrated Desktop Search
Integration in the Operating System
Ability to search nearly anywhere
Virtual Folders
- Slide 10: Client-Side Comparison
Feature | Microsoft® Windows® Desktop Search | Microsoft® Windows® Vista
Rich, actionable interface | X | X
Integration with Microsoft Outlook | X | X
Polite indexing (pauses when computer is in use) | X | X
Live icons & document previews X
X
Advanced Search integrated into the
Operating System X
Save searches to search folders X
Instant Search: X (on taskbar) for Windows Desktop Search; X (from start menu) for Windows Vista
- Slide 11: Server-Side Search
Platforms
Windows SharePoint Services v3
“Basic” index / search capabilities to
support WSS collaboration and document
management
Microsoft Office SharePoint Server
(MOSS) 2007
Enterprise search and indexing features
“unlocked”
Several SKUs to support different
scenarios and customer needs
- Slide 12: Key Differences Between WSS and MOSS
Columns: WSS v3 | Microsoft Office SharePoint Server (MOSS)
Can index: local SharePoint content (WSS v3); SharePoint sites / collections, Exchange Public Folders, File Shares, Web Content, Lotus Notes, LOB Apps, and others . . . (MOSS)
X
Rich, relevant results
X
Alerts, RSS, Did you
mean, Duplicate
X
collapsing
Scopes, Managed
X
Properties
Best Bets, Result
Removal, Query Reports X
Search Center Tabs X
BDC Search X
APIs provided: Query (WSS v3); Query + Admin (MOSS)
- Slide 13: MOSS 2007 for Search
A Search-only solution for intranets and
public-facing Web (Internet) sites
Two versions
Standard Edition limited to 500,000 docs
Enterprise Edition with unlimited docs
Includes
Out of the box search for file shares, Web sites,
SharePoint sites, Exchange Public Folders, Lotus
Notes databases
Extensibility to 3rd party document repositories
and file types
- Slide 14: MOSS 2007 and MOSS FS
Usage Scenarios
MOSS 2007
Description: An information management solution that includes enterprise search integrated with portal, collaboration, web content management, ECM, forms, and BI functionalities
Scenario: Customers who desire search as an integrated part of a broader information management solution
MOSS FS
Description: A core search-only solution for intranet and public-facing web sites
Scenario:
•Customers who require a core search-only product that can be integrated into their existing infrastructure
•Customers who require search functionality for their public-facing web (Internet) sites
- Slide 15: MOSS 2007 for Search and MOSS 2007
Features Comparison
Features | MOSS 2007 for Search (Standard Edition) | MOSS 2007 for Search (Enterprise Edition) | MOSS 2007 (Standard CAL) | MOSS 2007 (Standard plus Enterprise CAL)
File shares | X | X | X | X
Web sites | X | X | X | X
SharePoint sites | X | X | X | X
Microsoft Exchange Server public folders | X | X | X | X
Lotus Notes databases | X | X | X | X
Third party document repositories [1] | X | X | X | X
Secure content access control | X | X | X | X
Enhanced Search Center user interface | | | X | X
Search for people and expertise | | | X | X
Business Data Catalog (BDC) | | | | X
Search structured data sources | | | | X
Document limit | 500,000 | No Limit [2] | No Limit [2] | No Limit [2]
- Slide 16: Questions?
- Slide 17: Module 2
Microsoft Office SharePoint
Search 2007 –
Walkthrough
- Slide 18: Module Agenda
End-User Improvements
Relevance
People and Expertise
Business Data Search
Administration Improvements
Design Goals
Indexing Management
Security
Customization
Query Reporting
Performance Improvements
Demo MOSS 2007
- Slide 19: End-User Improvements
Relevance
Dramatically improved relevance
is the top goal of this release
New ingredients added including:
Anchor text
Click distance
URL depth
Missing metadata creation
Result is noticeably more relevant search
100% better on all queries
500% better on common queries
- Slide 20: End-User Improvements
People and Expertise
Bring people into the Search experience
Getting your job done means working with
the right people
Find subject-matter experts based on their
knowledge and contacts
Numerous improvements over SPS 2003
Index any LDAP V3 directory
Dedicated tab for finding people
Results grouped by “social distance” to you
- Slide 21: End-User Improvements
Business Data Search
Information in Line of Business (LOB) systems is often
hard to access
MOSS 2007 can bring that data to your users
Data is accessed through the Business Data
Catalog
Exposed to many features in SharePoint
Search can easily index the data
No need to write code
Highly customizable results
Integrated with scopes and Search center
- Slide 22: Administration Improvements
Design Goals
Address SPS 2003 administration user
interface pain points
Unify WSS and MOSS search
Enable full programmability via the object
model
Even better scalability and performance
- Slide 23: Administration Improvements
Indexing Management
Streamlined experience and more control
One index per shared service; no need to
worry about managing discrete indexes
Multiple start addresses per content source
MOSS indexes can drive the WSS search
experience
Allow upgrade from WSS to MOSS
- Slide 24: Administration Improvements
Security
Query-time security trimming in SPS 2003
File shares, WSS/SPS 2003, Exchange, Lotus Notes
(via mapping)
Now supports pluggable authentication
for content in WSS/MOSS sites
Based on ASP.NET 2.0 model
Minimum required crawler permission is now
just Full Read, not Administrator
Still provides the same security trimming
functionality
Ability to remove single items
- Slide 25: Administration Improvements
Customization
Search in every company is different
Different metadata might matter:
Documents: Title, Author, File location, Size
Records: Patient, Doctor, Healthcare provider, SSN…
How users meaningfully scope searches differs:
“All finance documents”
“All patient records”
“All published documents”
Customize results to “pop” metadata that
matters
Customization offered at many levels
Web Parts, XSLT/CSS, full object model…
- Slide 26: Administration Improvements
Query Reporting
Best way to improve Search
is to understand current usage
New out-of-box usage reporting:
Query volume trends, top queries,
click-through rates, queries with zero
results, etc.
At both site and service provider levels
Export data for extended reporting in
Excel
Respond to feedback with configuration
changes or editorial results
- Slide 27: Performance Improvements
Key new features make the crawls faster so
the content is fresher
More efficient SharePoint crawling
(Change Log Crawl)
Continuous propagation
Unified WSS and MOSS search
Security Change Only Crawl
Maximum scale is 10s of millions
of documents per indexer
- Slide 28: Demo – MOSS 2007
Goal of demo is a high level overview with
focus on:
•Search boxes and advanced search
•Search results experience
•Search Center
•Admin experience
- Slide 29: Questions?
- Slide 30: Module 3
Architecture and Deployment
Scenarios
- Slide 31: Agenda
Key concepts
MS Search Architecture
Deployment Building Blocks
WSS v3 Search Topologies
MOSS 2007 Search Topologies
Search Topology scenarios
Small
Medium
Large
Geographically distributed
Solution scenarios
Collaboration sites
Enterprise portal
Internet facing portal
- Slide 32: Microsoft Search Architecture
[Diagram: layered search architecture. Top: OOB Search UI / custom search apps, which send queries to and receive results from the Query OM and Web Service. Middle: the Query Engine and Index Engine around the content index, with labels for ranking, keywords, best bets, stemmers, scopes, schema, search configuration data, word breakers, protocol handlers, iFilters, and the crawl log. Bottom: content sources, including SharePoint sites, external Web sites, network shares, Notes, Exchange folders, business data, and more.]
- Slide 33: SharePoint Search Topologies:
Deployment Building Blocks
Physical building blocks:
Web Front-End Servers
Application servers (Query, Index, Excel Services, etc.)
SQL Databases
Search functionality segmented into two roles:
Indexer
Query
MOSS 2007 specific
Shared Service Provider (SSP)
Indexer
Web Application(s)
Site Collection(s)
Content Database(s)
Virtual Server(s) (IIS)
- Slide 34: WSS v3 Search Topology Basics
WSS uses both server roles on the same
machine (“Search Server”)
Indexing
Query
Ability to index local content only
Site Collection (content database(s))
Content is automatically indexed
minimal search administration
Ability to query at a site and below it
stsadm command exposes some admin
operations
Can Crawl Multiple content databases
- Slide 35: Sample WSS v3 Topology
[Diagram: user requests pass through a load balancer to the Web front ends; a single search server holding both the indexing and query roles crawls the content databases.]
- Slide 36: WSS v3 - Topology
Considerations
Scale out just like WSS
Add content databases for content
Add search servers for search
Each search server can serve up to
100 content databases
Could be lower depending on the data in
the content database
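The sizing rule on this slide is simple arithmetic. The following is a minimal sketch, assuming the 100-database ceiling applies uniformly (the slide notes it could be lower). The function name and the `per_server` parameter are illustrative and not part of any SharePoint tool.

```python
import math

def search_servers_needed(content_databases: int, per_server: int = 100) -> int:
    """Estimate how many WSS v3 search servers a farm needs.

    per_server defaults to the slide's upper bound of 100 content
    databases per search server; pass a lower value for heavy databases.
    """
    if content_databases < 0 or per_server <= 0:
        raise ValueError("content_databases must be >= 0 and per_server > 0")
    return math.ceil(content_databases / per_server)

print(search_servers_needed(250))      # 250 databases at the 100-per-server ceiling
print(search_servers_needed(250, 60))  # same farm with a more conservative limit
```

Lowering `per_server` is one way to model the caveat that data-heavy content databases reduce the practical limit.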
- Slide 37: MOSS 2007 Search Topology
Basics
Adds new functionality over base WSS
Search
Application server roles can be
separated:
Indexer
Query server
Propagation from indexer to query
servers
Crawl local + external content
Enhanced administration experience
Ability to search across site collections
- Slide 38: MOSS 2007 Search Topology
Basics (cont)
Query role can be assigned to one or
more servers
Indexing role can only be assigned to a
single server
Multiple query servers not allowed IF
server is providing both indexing and
query services
Only one index per SSP . . . although
you can have multiple SSPs
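The role rules on this slide can be read as a small set of checks. The sketch below is only an illustration of those rules for a single SSP. The `Server` class and `validate_ssp_topology` function are hypothetical names, not SharePoint APIs, and the check assumes the SSP has exactly one indexer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    name: str
    roles: frozenset  # subset of {"index", "query"}

def validate_ssp_topology(servers):
    """Return a list of rule violations for one SSP's search servers."""
    problems = []
    indexers = [s for s in servers if "index" in s.roles]
    query_servers = [s for s in servers if "query" in s.roles]
    # The indexing role can only be assigned to a single server.
    if len(indexers) != 1:
        problems.append(f"expected exactly one indexer, found {len(indexers)}")
    # The query role must be held by one or more servers.
    if not query_servers:
        problems.append("at least one server must hold the query role")
    # An indexer that also answers queries must be the only query server.
    if any("query" in s.roles for s in indexers) and len(query_servers) > 1:
        problems.append("indexer also serves queries, so no other query servers are allowed")
    return problems

separated = [
    Server("idx1", frozenset({"index"})),
    Server("q1", frozenset({"query"})),
    Server("q2", frozenset({"query"})),
]
mixed = [
    Server("idx1", frozenset({"index", "query"})),
    Server("q1", frozenset({"query"})),
]
print(validate_ssp_topology(separated))  # no violations expected
print(validate_ssp_topology(mixed))      # one violation expected
```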
- Slide 39: Sample MOSS 2007 Topology
[Diagram: user requests pass through a load balancer to the Web front ends and on to the query servers, which are separated from the indexer; the indexer crawls local content databases plus external content and propagates indexes to the query servers.]
- Slide 40: MOSS 2007 – Search Topology
Considerations
Indexing operations are CPU intensive
Dedicated query servers *might* be
better in a query heavy environment
MOSS / WSS crawls do involve making
HTTP requests against the WFE(s)
Dual role, WFE / Query servers more
efficient with security trimming
All servers should be on same network
segment
- Slide 41: MOSS 2007 – Search Topology
Considerations (cont)
Each farm can index up to 50 million
items
Beyond this, add more farms
Hardware is important
- Slide 42: Shared Search Service
Shared Service Provider (SSP) – grouped high-
value, resource intensive services
Shared services are consumed by web
applications (and sites within them)
“Always on” shared services – all sites in a
web application use the same index
Resource intensive operations controlled
centrally
Some admin experience is manageable at site
level
- Slide 43: Search Shared Service
[Diagram: a Shared Service Provider (SSP) hosting the Search service, People service, and others, with external content; below it, virtual servers http://hr, http://finance, and http://sales, each containing SPSites and SPWebs stored in content databases.]
- Slide 44: Search Shared Service
[Diagram: the Shared Service Provider (Search service, People service, ...) with external content and virtual servers http://hr, http://finance, and http://sales, each containing SPSites and SPWebs in content databases, shown alongside the physical topology: user requests, load balancer, Web front ends, query servers, propagation of indexes, and an indexer crawling the content databases ("Content Indexed").]
- Slide 45: Common Search Topologies
Deployment scenarios
Small
Medium
Large
Geographically Distributed (MOSS only)
- Slide 46: Small Search Deployment
WSS
Single Search Server with both roles
Index
Single Site Collection only!
Single Set of Content Databases
Query
MOSS
Single Server
Dual Role
Index
SSP Based – Multiple Site Collections
Multiple Set of Content Databases
Query
MOSS for Search
Single Server / Dual Role (Index and Query)
- Slide 47: Medium Search Deployment
WSS
Multiple Search Servers with the following limitations
Single Index Server
Single Site Collection
Single Set of Content Databases
Multiple Query Servers
MOSS
Three Servers
One Index Server
Two Query Servers running on two Web Front-End servers
MOSS for Search
Three Servers
One Index Server
Two Query Servers
- Slide 48: Large Search Deployment
WSS
Multiple Search Servers with the following limitations
Multiple Index Servers (64-bit)
Each Indexing a Single Site Collection with their own Set of
Content Databases
Index servers are not redundant with one another.
Multiple Query Servers each associated with their own single
Index Server running on the same machine (64-bit)
Query servers are not redundant with one another
MOSS
One Index Server (64-bit)
Many Separate Query servers (64-bit)
MOSS for Search
One Index Server (64-bit)
Many Separate Query servers (64-bit)
- Slide 49: Geographically Distributed Sites
MOSS Search Deployment
[Diagram: three Shared Service Providers, each with a Search service, a People service, and external content. The Corp. Sites SSP indexes Corp, EMEA, APAC, and other locations and serves http://hr, http://finance, and http://sales. A second SSP indexes EMEA only (http://emeahr, http://emeafinance, http://emeasales); a third indexes APAC only (http://apachr, http://apacfinance, http://apacsales). "Other Locations" are shown next to the regional SSPs. Each virtual server contains SPSites and SPWebs.]
- Slide 50: Deployment Scenarios
Collaboration Environment (WSS v3)
Enterprise Portal (MOSS 2007)
Internet Facing Portal (MOSS 2007)
- Slide 51: Collaboration Environment
Scenario WSS v3
iTech – startup software consulting
firm
Large number of disjoint teams
working on projects of varying
durations
Team sites used for collaboration and
communication
- Slide 52: Collaboration Environment Scenario
WSS v3 (cont)
WSS farm with single IIS virtual server http://team
Scales to large number of team sites
Content indexed automatically
WSS v3 standalone topology
1 Search box (both roles)
[Diagram: user requests pass through a load balancer to the Web front ends; one search server handles both indexing and query and crawls the content databases.]
- Slide 53: Collaboration Environment
Scenario WSS v3 (cont)
Search – core feature of WSS
Contextual scopes – site and list
No search across sites
[Diagram: virtual server http://team hosting SPSites team1, team2, and team3, each with SPWebs, stored in content databases.]
- Slide 54: Enterprise Portal Scenario
MOSS 2007
iTech – growing company with growing
needs
iTech – needs a single point for
information access for employees
They now need to search over other
repositories:
Personnel records – People search
Siebel sources – BDC search
File Shares / Web sites – other external
data
- Slide 55: Enterprise Portal Scenario
MOSS 2007 (cont)
Upgrade from WSS to MOSS
Search is a shared service through the SSP
Central enterprise portal – http://itech
Existing virtual server http://team associated
with SSP – search box switches to use
MOSS
Base WSS search is not running – but
search available to sites through shared
search service
Indexes – local and external content
- Slide 56: Enterprise Portal Scenario
MOSS 2007 (cont)
[Diagram: one farm with a Shared Service Provider (Search service, People service, ...) and external content. Virtual server http://itech hosts SPSites Sales, Finance, and HR; virtual server http://team hosts SPSites team1, team2, and team3. Each virtual server has SPWebs and its own content databases.]
- Slide 57: Enterprise Portal Scenario
MOSS 2007 (cont)
Topology with indexer and query servers added for throughput
Load balanced query servers
Scale out and scale up – new SSP dimension
Single indexer crawls logical SSP = local + external content
[Diagram: user requests pass through a load balancer to the Web front ends and query servers; the indexer crawls the content databases and propagates indexes to the query servers.]
- Slide 58: Internet Facing Portal
Scenario - MOSS 2007
Internet facing site for customers –
www.itech.com
High traffic focused on content
presentation
Public access
More publishing and less collaboration
Controlled and tightly managed
content
- Slide 59: Internet Facing Portal
Scenario - MOSS 2007 (cont)
Two separate farms: Production and
test farms
MOSS installation
Controlled publishing of content to
production farm from test farm
Single shared service provider per
farm
Shared search service in each farm
crawls content in each farm
independently
- Slide 60: Internet Facing Portal
Scenario - MOSS 2007 (cont)
[Diagram: a production farm and a test farm, each with its own SSP (Search service, People service, ...). The production virtual server www.itech.com and the test virtual server http://itechtest each host SPSites itech, About, Customers, and Services, with SPWebs and separate content databases.]
- Slide 61: Questions?
- Slide 62: Module 4
Crawl and Query Processes
- Slide 63: Agenda
The Crawl Process
Crawl Walkthrough
Index Propagation
The Query Process
- Slide 64: Crawl Walkthrough
When a crawl is requested . . .
2. Indexer grabs the start address of
content source
3. Start address is prefixed with protocol
associated with accessing the content
4. Appropriate protocol handler invoked
to traverse the content source
5. During traversal, the handler will
identify content nodes it needs to
index
- Slide 65: Crawl Walkthrough (cont)
1. Protocol handler invokes IFilter associated with content node type
2. IFilter identifies and extracts properties from content node
3. Protocol handler supplements IFilter data with additional property information
4. Data associated with content node is added to index
5. Index “delta” propagates to search servers
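The crawl steps on the last two slides can be sketched as a toy pipeline. This is not how the SharePoint indexer is implemented. It only mirrors the order of operations: pick a protocol handler from the start address, traverse content nodes, run a filter per node type, and add the extracted data to an index. All names here (`CONTENT`, `file_handler`, `text_filter`, `crawl`) and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical in-memory "content source": URL -> (node type, raw text).
CONTENT = {
    "file://share/specs/plan.txt": ("txt", "Search deployment plan"),
    "file://share/specs/notes.txt": ("txt", "Crawl schedule notes"),
}

def file_handler(start_address):
    """Stand-in protocol handler: yield the content nodes under the start address."""
    for url, (node_type, raw) in CONTENT.items():
        if url.startswith(start_address):
            yield url, node_type, raw

def text_filter(raw):
    """Stand-in for an IFilter: extract text and properties from one node."""
    return {"text": raw, "length": len(raw)}

PROTOCOL_HANDLERS = {"file": file_handler}
FILTERS = {"txt": text_filter}

def crawl(start_address, index):
    scheme = urlparse(start_address).scheme            # protocol prefix of the start address
    handler = PROTOCOL_HANDLERS[scheme]                # pick the matching protocol handler
    for url, node_type, raw in handler(start_address): # traverse the content source
        props = FILTERS[node_type](raw)                # filter extracts properties
        props["url"] = url                             # handler supplements the filter data
        for word in props["text"].lower().split():
            index.setdefault(word, set()).add(url)     # add the node's data to the index
    return index

index = crawl("file://share/specs/", {})
print(sorted(index["crawl"]))
```

Propagating the index "delta" to search servers is left out here because it only applies when the roles run on separate machines.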
- Slide 66: Crawl Overview Diagram
[Diagram: crawl overview. The Filter Daemon (protocol handler via IProtocolHandler, filter via IFilter, filtering thread pool) exchanges URLs and document chunks through shared memory with the Search Process (gatherer, word breakers, metadata extraction, indexer, property store). The indexer writes the SSP catalog; the SQL Server catalog holds the URL history, crawl queue, and property store.]
- Slide 67: Index Propagation
[Diagram: sample farm. User requests pass through a load balancer to the Web front ends and query servers; the indexer crawls content and propagates the index to the query servers.]
- Slide 68: Index Propagation
Propagation will occur only when
the index and search components
are on separate servers
Continuous propagation
Changes sent incrementally to all query
servers associated with the index server.
Merging of the index occurs on the query
servers after propagation.
Query servers continue serving queries
while propagation is in progress
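Continuous propagation as described above can be pictured as an ordered log of index changes that each query server merges at its own pace. The sketch below is a simplified model under that assumption, not the actual propagation protocol. The `Indexer` and `QueryServer` classes are hypothetical.

```python
class QueryServer:
    def __init__(self, name):
        self.name = name
        self.index = {}    # term -> set of document URLs
        self.applied = 0   # number of deltas merged so far

    def merge(self, delta):
        """Merge one propagated delta into the local index copy."""
        for term, urls in delta.items():
            self.index.setdefault(term, set()).update(urls)
        self.applied += 1

    def search(self, term):
        # Queries are answered from whatever has been merged so far.
        return sorted(self.index.get(term, ()))

class Indexer:
    def __init__(self, query_servers):
        self.query_servers = query_servers
        self.deltas = []   # ordered log of incremental index changes

    def add_delta(self, delta):
        self.deltas.append(delta)

    def propagate(self):
        """Send each query server only the deltas it has not merged yet."""
        for server in self.query_servers:
            for delta in self.deltas[server.applied:]:
                server.merge(delta)

q1, q2 = QueryServer("q1"), QueryServer("q2")
indexer = Indexer([q1, q2])
indexer.add_delta({"sharepoint": {"http://team/doc1"}})
indexer.propagate()
indexer.add_delta({"sharepoint": {"http://team/doc2"}})
indexer.propagate()
print(q2.search("sharepoint"))
```

Because each server tracks how many deltas it has applied, a second `propagate()` call sends only the new change, which is the incremental behaviour the slide describes.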
- Slide 69: Index Propagation
Index File Location
Set in Office SharePoint Server Search
Service settings
Default location: C:\Program Files\Microsoft Office Servers\12.0\Data\Office Server\Applications
Can be programmatically set using the stsadm command
Index Server:
“stsadm.exe -o editssp -indexlocation <index file path>”
Query Server
“stsadm.exe -o osearch -propagationlocation <index file path>”
- Slide 70: The Query Process
Query Initiation and Results
Presentation
Query Execution
Query Walkthrough
- Slide 71: Query Initiation and Results
Presentation
Typically, provided by the WSS / MOSS
WFE role, through OOB WebParts
Could be an Office client or other
custom application
Responsible for constructing the “full”
query and communicating with the
query execution services
- Slide 72: Query Execution
Always provided by a server tagged
with the Query role
Consumes a query request
Executes the request using the query
index on the file system as well as the
SSP search database (if MOSS)
Handles OOB security trimming
Returns requested properties of the
result set to the caller
- Slide 73: Query Walkthrough (cont)
When a query is requested . . .
2. Query terms collected
3. Terms supplemented with contextual
information
4. Query formulated and issued through the
Query OM or the Web Service
5. Query is executed against the index and
property store
6. Query results returned
Results are ordered according to their relevance
to the query words
Trimmed based on the user’s permissions.
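The last two steps, relevance ordering and security trimming, can be illustrated with a toy function. The scoring here (count of matched terms) is a deliberate simplification and is not the MOSS ranking algorithm. The function name, the `acl` mapping, and the sample data are all hypothetical.

```python
def run_query(terms, index, acl, user_groups):
    """Score documents by matched terms, trim by permissions, order by relevance."""
    scores = {}
    for term in terms:
        for url in index.get(term.lower(), ()):
            scores[url] = scores.get(url, 0) + 1  # toy relevance: number of matched terms
    # Security trimming: keep only documents the user is allowed to read.
    visible = {url: score for url, score in scores.items()
               if acl.get(url, set()) & user_groups}
    # Highest score first; ties broken by URL for a stable order.
    return sorted(visible, key=lambda url: (-visible[url], url))

sample_index = {"search": {"/a", "/b", "/c"}, "plan": {"/b", "/c"}}
sample_acl = {"/a": {"staff"}, "/b": {"staff"}, "/c": {"finance"}}
print(run_query(["search", "plan"], sample_index, sample_acl, {"staff"}))
```

In this example `/c` matches both terms but is dropped because the user is not in the `finance` group, so the caller never learns it exists.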
- Slide 74: Questions?
- Slide 75: Module 5
The Search End-User Experience
- Slide 76: Module Agenda
Introducing the Search End-User
Experience
Customizing Search
People Search
- Slide 77: Introducing the Search End-
User Experience
Complete Search experience
Search is everywhere
Tab-based user interface for easy
navigation
Easy to extend and customize
- Slide 78: Introducing the End-User Search Experience
Search Boxes
Search Center
Search Web Parts
- Slide 79: OOB Search UI/Custom Search Apps
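[Diagram: the Search Box and Advanced Search Web Parts submit queries over HTTP GET or POST; a hidden object passes the query to the Query OM (part of the Query OM and Web Service layer) and hands the results, as XML, to the Search Web Parts, which render them with an XSL transformation.]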
- Slide 80: Search WebParts
Nine Standard Search Web Parts
Search Box
Core Results
High Confidence
Statistics
Pagination
Action Links
Matching Keywords and Best Bets
Search Summary (Did you mean?)
Advanced Search
- Slide 81: Result page infrastructure
Data shared through hidden object
All Search Web Parts within the same page share
the same hidden object
Connection between Search Web Part is
automatically done
Need only to Drag and Drop (or select) a Search
Web Part on the page
Allows for rapid page design
Hidden Object is internal and cannot be used by
custom Web Parts
All Search Web Parts derive from Data Form
Web Part
- Slide 82: Advanced Search
Allows power searchers to exercise greater
control on how they query
A link from the search box
Control what is displayed in the page by
modifying the xml stored in the web part
property “Properties”
e.g., can be used for displaying a new
language check box
Not provided by WSS Search UI
Implemented using the SQL syntax
- Slide 83: Customizing the End User
Experience
Search in every company is different
Different metadata might matter
Documents: Title, Author, File location, size
Records: Patient, Doctor, Healthcare provider, SSN…
Multi- or single-languages
How users meaningfully scope searches differs
“All finance documents”
“All patient records”
“All published documents”
Customize results to “pop” metadata that
matters
Customization offered at many levels
Web Parts, XSLT/CSS, full Object Model…
- Slide 84: Customization Choices
Search Center
Simple Site with few pages
Default Page
Result Page
Advanced Search Page
People Search Page
Results Pages
All Sites Results Page
People Results Page
Advanced Search Page and Web Part
Show Scope Picker
Scopes
Property Picker
Languages
Search Web Parts
- Slide 85: Customizing Search
Adding Search Center Tabs
Customizing Search Web Parts
Customizing Search Results
- Slide 86: People Search
Bring people into the search experience
Getting your job done means working with
the right people
Find subject matter experts based on their
knowledge and contacts
People list can come from AD, SQL, others
Discovering Experts
People are as important as data!
- Slide 87: People Search
People Results
Customizing Results
- Slide 88: Refine Your People Search
Refine by Job Title
Searches for the selected Job
Title
Refine by Department
Searches for the selected
Department
“Show more options” link (6+)
Listed in order of frequency
- Slide 89: People Search Web Parts
Two OOB People Search Web Parts
People Search Box
People Search Core Results
Inherit from the Search Core Results Web Part
Can be mixed on the same page with
other Search Web Parts
- Slide 90: People Results Search Web
Parts
Web Part properties such as:
(similar to Core Search WP)
Formatting (e.g. width of the search
box)
Number of Results per page
Display “Alert Me”, “RSS” links
Turn stemming on/off (default “off”)
Remove Duplicate Results on/off
(default “on”)
Fixed keyword Query
Select Columns
Results formatting with XSL
Social Distance (view)
- Slide 91: Social Distance Colleagues
Suggested Colleague list
members are mined from:
Microsoft Windows
Messenger (IM)
Microsoft Office
Outlook e-mail
(Outlook Add-In)
- Slide 92: Questions?
- Slide 93: Module 6
Search Object Model
- Slide 94: Workshop Agenda
Scenarios for Extending Search
Query Syntax
Query Object Model
Query Web Service
- Slide 95: Topic: Scenarios for
Extending Search
In this first section we will examine 2
scenarios for extending Search:
Integrate with Search Center
Integrate Search into 3rd party sites and
applications
- Slide 96: Integrate with MOSS Search Center
Extending Search
Use cases:
Use Search URL request parameters to add
predefined saved searches
Build custom search box Web parts for
custom look and feel
Build custom search core result Web parts
for own look and feel and customized
querying
- Slide 97: Integrate MOSS Search into 3rd Party
Sites and Applications
Extending Search
Build 3rd party user interface which
leverages MOSS Search through Web
Services
Use cases
Add MOSS Search features into existing
Web sites
Add MOSS Search into existing line of
business or custom applications
- Slide 98: Topic: Query Syntax
In this section we will examine the three
types of search syntax for building search
queries supported by MOSS:
Keyword
URL
SQL
- Slide 99: Keyword Syntax
Overview
Used in standard Search Box
New keyword syntax
Simple and easy to use
Consistent property:value syntax
across Office, Windows and Live
search
gallery hinges -brass site:http://supportdesk scope:Products
- Slide 100: Keyword Syntax
Include/Exclude
Built-in support for using include and
exclude terms
Look for term bike, but not related to
fitness
bike -fitness
Look for phrase “SharePoint Services”
but not the term v2
+"SharePoint Services" -v2
Include is implied when there is no (+/-) prefix
- Slide 101: Keyword Syntax
Boolean Search
Narrowing results by default
Searches using “AND” between query terms
Does not recognize logical operators like
“OR”, “NEAR” as keywords – it treats them
all as search terms
Does not support complex queries like (A
AND B) OR (C AND D)
Complex Boolean searches are supported by
the engine and the SQL syntax
- Slide 102: Keyword Syntax
Property restrictions
• Supports property:value as part of the
keyword string
• Can use any managed property
• Supports the use of phrases
Can be used for exact matches when the property
value includes spaces
Without quotes, prefix matching is done. Supports
word stemming
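To make the keyword syntax from the last few slides concrete, here is a small tokenizer that separates included terms, excluded terms, and property:value restrictions. It is only a sketch of the syntax as described on the slides, not the parser MOSS uses. `parse_keyword_query` is a hypothetical name, and the sketch ignores a +/- prefix in front of a property restriction.

```python
import re

# Optional +/- prefix, optional property name, then a quoted phrase or a bare word.
TOKEN = re.compile(r'([+-]?)(?:(\w+):)?(?:"([^"]*)"|(\S+))')

def parse_keyword_query(query):
    """Split a keyword query into included terms, excluded terms, and property restrictions."""
    parsed = {"include": [], "exclude": [], "properties": []}
    for sign, prop, phrase, word in TOKEN.findall(query):
        value = phrase or word
        if not value:
            continue                      # skip empty quoted strings
        if prop:
            parsed["properties"].append((prop, value))
        elif sign == "-":
            parsed["exclude"].append(value)
        else:                             # "+" or no prefix: include is implied
            parsed["include"].append(value)
    return parsed

print(parse_keyword_query("gallery hinges -brass site:http://supportdesk scope:Products"))
print(parse_keyword_query('+"SharePoint Services" -v2'))
```

A bare term that itself contains a colon would be read as a property restriction by this sketch, which is one reason a production parser needs the list of managed properties.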
- Slide 103: Keyword Syntax
No wildcard support
No wildcard support in Keyword Syntax
Search box does not do wildcard searching. The
following is not recognized as a wildcard search
ShareP*
Use Advanced Search property restrictions to
look for parts of a word
Requires new search results Web parts
Wildcards are supported by the engine and
the SQL query syntax
- Slide 104: URL Syntax
Use Case
Launching a URL in custom application
Save Searches
Custom search boxes
Request Parameters
Content: results.aspx?k=fish
Scopes: results.aspx?k=fish&s=BBC
Sort:
results.aspx?v=date
results.aspx?v=relevance
Page: results.aspx?start=21
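A saved search or custom search box only has to assemble these request parameters. The helper below is a minimal sketch that uses the parameter names shown on the slide (`k`, `s`, `v`, `start`). The function name and the default page name are assumptions, and the real results page may accept further parameters not covered here.

```python
from urllib.parse import urlencode

def results_url(keywords, scope=None, sort=None, start=None, page="results.aspx"):
    """Build a results-page URL from the request parameters listed on the slide."""
    params = {"k": keywords}
    if scope:
        params["s"] = scope
    if sort:
        if sort not in ("date", "relevance"):
            raise ValueError("sort must be 'date' or 'relevance'")
        params["v"] = sort
    if start is not None:
        params["start"] = start
    return f"{page}?{urlencode(params)}"

print(results_url("fish", scope="BBC"))                   # results.aspx?k=fish&s=BBC
print(results_url("fly fishing", sort="date", start=21))  # results.aspx?k=fly+fishing&v=date&start=21
```

`urlencode` takes care of escaping spaces and reserved characters in the keyword text.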
- Slide 105: SQL Syntax Overview
SQL Syntax offers:
Consistent SQL across enterprise and
desktop
Complex queries and Boolean searches
Comparison operators
Arbitrary groupings for AND, OR, NOT
Freetext()
CONTAINS()
LIKE
ORDER BY ASC | DESC
Custom SQL query statements
Wildcard support
- Slide 106: SQL Syntax
Complex Boolean Searches
Write complex Boolean searches using
AND, OR, NOT
- Slide 107: SQL Syntax
FREETEXT predicate
Returns documents for which the
following is true:
Document contains all the search terms in
at least one of the columns specified
One of the search terms must also be
found in the Contents column
Use only one FREETEXT predicate for
optimal ranking
The FREETEXT predicate also
supports (+/-)
- Slide 108: SQL Syntax
Wildcard Support
Get wildcard support using the
CONTAINS predicate:
Wildcard: Words or phrases with an
asterisk (*) added to the end.
WHERE CONTAINS
('
"compu*" NEAR "soft*"
')
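The CONTAINS fragment above is only the WHERE clause. The sketch below wraps it in a complete statement to show where the wildcard terms go. The `SELECT ... FROM SCOPE()` framing and the `Title` and `Path` columns are assumptions that do not appear on the slide, so check them against the SDK before relying on them. `contains_query` is a hypothetical helper.

```python
def contains_query(prefixes, columns=("Title", "Path"), operator="NEAR"):
    """Build a SQL-syntax query string that uses prefix wildcards inside CONTAINS."""
    if operator not in ("NEAR", "AND", "OR"):
        raise ValueError("unsupported operator")
    terms = []
    for prefix in prefixes:
        if not prefix.isalnum():
            # Refuse anything that could break out of the quoted term.
            raise ValueError(f"refusing to quote unusual prefix: {prefix!r}")
        terms.append(f'"{prefix}*"')   # the asterisk goes at the end of the word
    condition = f" {operator} ".join(terms)
    column_list = ", ".join(columns)
    # Assumed statement shape; only the CONTAINS(...) part comes from the slide.
    return f"SELECT {column_list} FROM SCOPE() WHERE CONTAINS('{condition}')"

print(contains_query(["compu", "soft"]))
```

Rejecting non-alphanumeric prefixes is a crude guard against injecting extra SQL through user input.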
- Slide 109: SQL Syntax
Removed from SQL syntax
Removed in MOSS 2007
Query property weights
UNION ALL
MATCHES
SELECT *
COALESCE TABLE
- Slide 110: Topic: Query Object Model
In this section we will examine:
The Query Object Model
The Query Object Path
The Query Web Service
- Slide 111: Query Object Model
New object model
Use the query object model to:
Build custom search user interface, like
Web parts or ASPX applications
Gain direct access to query and results
properties
Invoke custom queries
2 types of query syntaxes:
Keyword
SQL
- Slide 112: Query Object Model
Features
Managed code API
Single request – multiple results
Result Types Optional parameters
• •
Relevant results # of Sentences in
Summary
• High confidence
•
results Implicit - AND/OR
• •
Special terms Number of results
• •
Definitions Ignore noise words
• Enable stemming
• Language
- Slide 113: Query Object Path
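[Diagram: Input: a Keyword Query or SQL Query with optional parameters, issued from the site UI or a custom client. Query OM: Execute() runs the query against the local or remote query engine. Output: a ResultTableCollection of ResultTable objects (each an IDataReader) for relevant results, definitions, high confidence results, and special terms.]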
- Slide 114: Query Web Service
Use and Methods
Use Case
Leverage Search in remote sites or
application
Office Research Pane
Methods
Query
QueryEx
GetSearchMetaData
Registration
Status
- Slide 115: Query Web Service
Search Center Features
Standard Search Center features not
built into the Web service
Hit highlighting
Search usage reporting
Search logging
Search statistics
Result type icons
Using Query vs. QueryEx
Implementing hit highlighting
- Slide 116: Questions?
- Slide 117: Module 7
Administration
- Slide 118: Module Agenda
Administrative Architecture
Farm Administration
SSP Administration
Site Collection Administration
Site Administration
Search Usage Reporting
Administrative Tools
Lab: Adding Content Sources
Lab: Search Schema
- Slide 119: Administrative Architecture
Three Tier Administration
Web-based
Role- and Task-delineated
Controlled Delegation
Secure Isolation
Central Administration
IT Administrators
Farm-level Resource management and Status
One per farm
E.g. Create new site
Shared Services
Business unit IT
Service-level configuration
E.g. Create search content source, Search Scopes
Site Settings
Business site owner
Site specific configuration and tasks
E.g. Create new list
- Slide 120: Farm Management
(IT Administrators)
- Slide 121: SharePoint 3.0 Central Administration
Common Tasks
Manage Topology and Services
Servers in Farm
Services in Server
Security Configuration
Update Farm Administrator’s Group
Backup and Restore
Index
Search Database
Global Configuration
Timer Job Definitions
Timer Job Status
Manage Search Service
- Slide 122: Using Central Admin
- Slide 123: Operations – Topology and Services
Servers in Farm / Services on Server
Query Server(s)
Office SharePoint Server Search Service
Stop / Start
Office SharePoint Services
Help Search Service
Stop / Start
Index Server(s)
Office SharePoint Server Search Service
Stop / Start
- Slide 124: Operations – Backup and Restore
Perform a backup
Restore from backup
- Slide 125: Operations – Global Configuration
Timer Job Definitions
SharePoint Services Search Refresh
Disable / Enable (Change and update WSS search configuration)
Indexing Schedule Manager on MOSS
Disable / Enable
Timer Job Status
Succeeded / Failed
- Slide 126: Search Application Management
Manage Search Service
Farm-level Search settings
Proxy Server settings
Query and Index Servers
Server Listing and their Search
service
Shared Service Providers with
Search enabled
SSP name listing
Crawler Impact Rules
- Slide 127: Crawler Impact Rules
Configured through Central
Administration
Allows “throttling” of the indexer to
reduce impact of a crawl on a
particular server
Supports wildcards
Used in conjunction with crawl
schedules
- Slide 128: Crawler Impact Rules (cont)
Use . . . | To . . .
* as the site name | Apply the rule to all sites
*.* as the site name | Apply the rule to sites with a dot in their name
*.site_name.com as the site name | Apply the rule to all sites in the site_name.com domain
*.top-level_domain_name (such as *.com or *.net) as the site name | Apply the rule to all sites that end with a specific top-level domain name
? | Replace any single character in a rule
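The wildcard behavior in the table can be approximated with shell-style pattern matching. This is a sketch only: it uses Python's fnmatch to mimic * and ?, and is not how the crawler itself evaluates rules. The function name and host names are hypothetical.

```python
from fnmatch import fnmatchcase


def rule_applies(site_pattern, host):
    """Return True if a crawler impact rule's site pattern covers the host.

    '*' matches any run of characters and '?' matches exactly one,
    which mirrors the wildcard table above (comparison is case-insensitive).
    """
    return fnmatchcase(host.lower(), site_pattern.lower())


if __name__ == "__main__":
    for pattern, host in [
        ("*", "intranet"),
        ("*.*", "intranet"),
        ("*.site_name.com", "hr.site_name.com"),
        ("*.com", "www.example.net"),
        ("moss-0?", "moss-01"),
    ]:
        print(pattern, host, rule_applies(pattern, host))
```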
- Slide 129: Shared Services Provider
(SSP)
Management
(SSP Administrators)
(Content Oriented Administration)
- Slide 130: Common Tasks
Configure Search Settings
Content Sources
Crawl Settings
Authoritative Pages Settings
Scopes
- Slide 131: Content Sources
Represent an arbitrary container of
information
Require at least one start address,
although multiple start addresses can
be provided
Start address cannot be reused
Requires a registered protocol handler
Five out-of-box content source types
are available, mapping to the five out-
of-box protocol handlers
- Slide 132: SharePoint Content Source
Includes SPS 2003, MOSS 2007, WSS v2, and
WSS v3 sites
Can limit crawl to only sites specified in start address
or all sites found below one or more provided
hostnames
Crawler will use target site’s APIs to include security
information around content in the index
For SPS 2003 content sources, crawler account
requires “change” rights, which necessitates the
crawler having administrator rights
Examples: sps3://moss-01/ or
http://moss-01/sitecollection/
Content sources decoupled from scopes
- Slide 133: Web Site Content Source
Any content source available over
HTTP or HTTPS
If a SharePoint URL is provided, the
crawler will detect this and index it as
though it were a SharePoint content
source (this can be overridden with
crawl rules)
Page depth and server hops can be
controlled
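One way to picture the page depth and server hops limits is a breadth-first walk over a link graph. The sketch below is a simplified model, not the crawler's algorithm: it assumes depth counts every link followed from the start address and that a hop is any link whose host differs from the linking page. The links dictionary is hypothetical test data.

```python
from collections import deque
from urllib.parse import urlsplit


def crawl_plan(start_url, links, max_depth, max_hops):
    """Return the set of URLs reachable within both limits.

    links: dict mapping a URL to the list of URLs it links to.
    """
    seen = {start_url}
    queue = deque([(start_url, 0, 0)])  # (url, depth so far, hops so far)
    while queue:
        url, depth, hops = queue.popleft()
        for target in links.get(url, []):
            # Moving to a different host costs one server hop.
            crossed = urlsplit(target).netloc != urlsplit(url).netloc
            next_hops = hops + (1 if crossed else 0)
            if target in seen or depth + 1 > max_depth or next_hops > max_hops:
                continue
            seen.add(target)
            queue.append((target, depth + 1, next_hops))
    return seen


if __name__ == "__main__":
    links = {
        "http://a/": ["http://a/1", "http://b/"],
        "http://a/1": ["http://a/2"],
        "http://b/": ["http://c/"],
    }
    print(sorted(crawl_plan("http://a/", links, max_depth=1, max_hops=0)))
    print(sorted(crawl_plan("http://a/", links, max_depth=2, max_hops=1)))
```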
- Slide 134: Web Site Content Source
(cont)
Security information around content is
not included in index
Dynamic personalization will result in
the index being populated with what
the crawler is presented with
Example: http://website or
http://www.somesite.com
- Slide 135: File Shares Content Source
Any content visible over a Windows
server shared folder
Some non-Windows shares *may* be
crawled, if that share can be presented
as a Windows share (for instance,
Samba with Linux, Services for Unix)
Start address can be the share root or
subfolders beneath it
Security information is picked up by
the gatherer
- Slide 136: Exchange Public Folders
Content Source
Allows the indexer to crawl a public
folder that exists on Exchange
Requires Outlook Web Access, as
crawl is done over HTTP
Includes messages, conversations,
and other collaborative content
URL presented in the search results
will point to a deep link within OWA
Example: http://owa/public/folder
- Slide 137: Business Data Content
Source
Allows the indexer to crawl metadata
exposed through the Business Data
Catalog
Can elect to include all Business Data
Applications or a selected number of
them
- Slide 138: Lotus Notes Content Source
- Slide 139: Crawling Schedules
Allow administrator to indicate the frequency
at which a content source will be re-crawled
(daily, weekly, monthly)
Can indicate what time the content source
should be crawled
Schedule should be driven by:
Anticipated change at the content source (is this
static content or content that is constantly
changing)
Business expectations around when content
changes should be reflected in the index
Schedule can always be modified
- Slide 140: Maximum File Size
Default file size limit is 16MB
To change the limit, add a new DWORD entry
MaxDownloadSize in the registry at
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Global\Gathering Manager
Make sure to increase the timeout value to
avoid timeout exceptions
Change the timeout value using the Manage
Search Service page of the Central Admin
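As a sketch of the registry change, the helper below renders the text of a .reg file rather than touching the registry directly. It assumes the key path reads as shown once its backslash separators are restored, and that MaxDownloadSize is expressed in megabytes; both should be verified before importing the file. The function name is hypothetical.

```python
# Key path from the slide, with backslash separators assumed.
KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
       r"\12.0\Search\Global\Gathering Manager")


def max_download_reg(size_mb):
    """Render .reg file text that sets MaxDownloadSize (assumed unit: MB)."""
    if not 0 < size_mb <= 0xFFFFFFFF:
        raise ValueError("size_mb must fit in a DWORD")
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{KEY}]\n"
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"MaxDownloadSize"=dword:{size_mb:08x}\n'
    )


if __name__ == "__main__":
    print(max_download_reg(64))
```

Restart the search service after the change and run a full crawl so that previously skipped large files are picked up.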
- Slide 141: Crawl Rules
Define exceptions to the “typical”
crawl process
Addresses can be pattern matched for
special treatment
Support exclusion
Support altering the authentication
mechanism
Examples of Crawl Rules
Testing of Crawl Rules
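A minimal model of crawl rule evaluation, useful for reasoning about which rule a given address would hit. It assumes rules are checked in order and the first matching pattern wins, and that addresses matching no rule are crawled with the default content access account. The class, rule patterns, and account names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class CrawlRule:
    pattern: str                    # e.g. "http://moss-01/private/*"
    include: bool                   # False excludes matching addresses
    account: Optional[str] = None   # alternate crawl account, if any


def evaluate(url, rules, default_account):
    """Return (crawl?, account) from the first rule whose pattern matches."""
    for rule in rules:
        if fnmatchcase(url.lower(), rule.pattern.lower()):
            return rule.include, rule.account or default_account
    # No rule matched: crawl with the default content access account.
    return True, default_account


if __name__ == "__main__":
    rules = [
        CrawlRule("http://moss-01/private/*", include=False),
        CrawlRule("http://moss-01/*", include=True, account="LITWARE\\crawler2"),
    ]
    for url in ("http://moss-01/private/salary.xlsx", "http://moss-01/docs/a.docx"):
        print(url, evaluate(url, rules, "LITWARE\\crawler"))
```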
- Slide 142: Search Result Removal
(From Live Index)
Typically used when someone
discovers something in the index that
shouldn’t be there
Permits administrator to immediately
remove that content from the index
Crawl rule automatically created to
prevent that content from being indexed
in the future
Restoring that content requires
dropping the crawl rule and re-indexing
- Slide 143: Default Content Access
Account
Account used for crawling, by default
Can be overridden in the Crawl Rules
Set the default account to use when
crawling content
Minimum crawler permission is “Full Read”
(still provides the same security trimming
functionality)
Automatically configured for new sites
To avoid crawling unpublished versions of a
document, do not use an Administrator
account.
- Slide 144: Metadata Property Mappings
- Slide 145: Server Name Mapping
Override how MOSS displays
search results
Hide file path
Sample: “file://moss/HOL” to
“http://moss.litwareinc.com”
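The effect of a server name mapping can be sketched as a prefix rewrite applied to each result URL at display time. This is an illustration of the idea only; the function name and the sample document path are hypothetical.

```python
def map_result_url(url, mappings):
    """Rewrite a crawled address to its display address.

    mappings: list of (crawled_prefix, displayed_prefix) pairs,
    checked in order; the first matching prefix is applied.
    """
    for crawled, displayed in mappings:
        if url.lower().startswith(crawled.lower()):
            return displayed + url[len(crawled):]
    return url  # no mapping applies, show the address as crawled


if __name__ == "__main__":
    mappings = [("file://moss/HOL", "http://moss.litwareinc.com")]
    print(map_result_url("file://moss/HOL/specs/plan.docx", mappings))
```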
- Slide 146: Search-based Alerts
Can be Activated / Deactivated
Deactivated after a reset of crawled content
Users can subscribe to an alert on a search
query
Alert is triggered if there are new or changed
items that satisfy the search query
An item is considered changed if its content
or metadata has changed
Timer service is used to issue all alert notifications (See User Alerts in Site Settings)
Frequency can be set to Daily / Weekly
“Alert Me” and RSS links can be added/removed using their Web Part property
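The "new or changed" test for a search-based alert can be sketched as a comparison of fingerprints between two runs of the same query. This is an assumed model, not the product's implementation: each item is reduced to a hash over its content and metadata, and any URL with a missing or different hash triggers the alert. All names and sample data are hypothetical.

```python
import hashlib


def fingerprint(item):
    """Hash an item's content and metadata together."""
    payload = item["content"] + "\x00" + repr(sorted(item["metadata"].items()))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def items_to_alert(previous, current):
    """Return URLs that are new or changed since the previous run.

    previous: dict of url -> fingerprint saved from the last run
    current:  dict of url -> item returned by the query now
    """
    return sorted(
        url for url, item in current.items()
        if previous.get(url) != fingerprint(item)
    )


if __name__ == "__main__":
    old = {"http://moss-01/a": fingerprint(
        {"content": "v1", "metadata": {"Author": "Jade"}})}
    new = {
        "http://moss-01/a": {"content": "v1", "metadata": {"Author": "Jade Yu"}},
        "http://moss-01/b": {"content": "draft", "metadata": {}},
    }
    print(items_to_alert(old, new))
```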
- Slide 147: Reset Crawled Content
Powerful action!
Will delete the content index!
Search Results will no longer be available
on the farm until the index has been rebuilt!
Search alerts are deactivated unless the
administrator unchecks the check box.
Alerts should be reactivated after a full crawl
has been performed.
- Slide 148: Specify Authoritative Pages
Helps prioritize Search Results: a way to
influence relevance, because results that are
linked to the authoritative pages benefit
from a boost in rank.
Most authoritative
Second-level authoritative
Third-level authoritative
Sites to demote
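The boost can be pictured as click distance: the fewer links between an authoritative page and a result, the larger the benefit. The sketch below computes that distance with a breadth-first search over a hypothetical link graph. It does not reproduce the actual ranking formula, and it ignores the separate authority levels and demoted sites.

```python
from collections import deque


def click_distance(links, authoritative):
    """Return the minimum number of clicks from any authoritative page.

    links: dict mapping a page to the pages it links to.
    Pages that cannot be reached are left out of the result.
    """
    distance = {page: 0 for page in authoritative}
    queue = deque(authoritative)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


if __name__ == "__main__":
    links = {"home": ["hr", "it"], "hr": ["benefits"], "it": []}
    print(click_distance(links, ["home"]))
```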
- Slide 149: Scopes
Scopes are filters applied to search
results to narrow the results of a
search query
Types of Scopes
Scope Rules and Behaviors
Single-rule Scopes
Multi-rule Scopes
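A sketch of how a multi-rule scope might combine its rules, assuming the three rule behaviors Include, Require, and Exclude. Under this reading an item is in scope when no Exclude rule matches, every Require rule matches, and at least one Include rule matches if any Include rules exist. The rule predicates and sample items are hypothetical.

```python
def in_scope(item, rules):
    """Decide whether an item belongs to a scope.

    rules: list of (behavior, test) pairs, where behavior is
    "include", "require" or "exclude" and test is a predicate on the item.
    """
    includes = [test for behavior, test in rules if behavior == "include"]
    requires = [test for behavior, test in rules if behavior == "require"]
    excludes = [test for behavior, test in rules if behavior == "exclude"]
    if any(test(item) for test in excludes):
        return False
    if not all(test(item) for test in requires):
        return False
    if includes:
        return any(test(item) for test in includes)
    # With no Include rules, Require rules alone define the scope.
    return bool(requires)


if __name__ == "__main__":
    rules = [
        ("include", lambda i: i["url"].startswith("http://moss-01/hr")),
        ("exclude", lambda i: i["type"] == "draft"),
    ]
    print(in_scope({"url": "http://moss-01/hr/policy.docx", "type": "final"}, rules))
    print(in_scope({"url": "http://moss-01/hr/new.docx", "type": "draft"}, rules))
```

A single-rule scope is the special case where the list holds one Include or Require rule.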
- Slide 150: Site Collection
Management
(Site Collection Administrators)
(Application Administrators)
- Slide 151: Site Collection Administration Options
Common Tasks
Search Settings
Search Scopes
Search Keywords
- Slide 152: Search Settings
Two Options
Use the Search Center and custom scopes in the
dropdown
The way to change the standard Search Center URL for
search boxes
Do not use the Search Center – no custom scopes
- Slide 153: Site Level Scopes
Site Level Scopes display all scopes associated with a Site Collection
Display Scopes are a site-level feature that is purely UI
Administrator – Combine multiple scopes into one selectable item
Visitors – UI Search dropdown box (or checked boxes for the Advanced
Search page) populated with the scopes included in the display group
- Slide 154: Keywords and Best Bets
Prominently present editorially selected
search results
Keywords: Glossary of important terms
within your organization
Best Bets are associated with particular
search keywords
Not available across site collections
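The keyword to Best Bet relationship can be sketched as a lookup table held per site collection. The sketch assumes the whole query must equal the keyword or one of its synonyms, ignoring case; the data structure, URL, and function name are hypothetical.

```python
def best_bets_for(query, keywords):
    """Return the Best Bet URLs for a query, or an empty list.

    keywords: dict of keyword -> {"synonyms": [...], "best_bets": [...]}
    """
    wanted = query.strip().lower()
    for keyword, entry in keywords.items():
        terms = {keyword.lower()}
        terms.update(s.lower() for s in entry.get("synonyms", []))
        if wanted in terms:
            return entry["best_bets"]
    return []


if __name__ == "__main__":
    keywords = {
        "BDC": {
            "synonyms": ["Business Data Catalog"],
            "best_bets": ["http://moss-01/bdc-guide"],
        }
    }
    print(best_bets_for("business data catalog", keywords))
    print(best_bets_for("bdc search", keywords))
```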
- Slide 155: Search Settings for Fields - NoCrawl
Set a NoCrawl attribute on one or
more columns within the site
collection
Column content will not be indexed!
Associated with Site Columns
(Content Types)
- Sli