Re: What characters are considered as word boundaries
- From: "Hilary Cotter" <hilary.cotter@xxxxxxxxx>
- Date: Wed, 6 Sep 2006 11:06:55 -0400
It's not configurable. Basically, all non-alphanumeric characters are not
indexed and are treated as word boundaries. There are some exceptions: the
handling of the hyphen (-) varies by language, and the period (.) is
special-cased. Something like F.B.I. is indexed as both F.B.I. and FBI,
whereas f.b.i. is broken into f, b, and i.
The same applies to + and # after uppercase characters: C# is indexed as C#,
whereas c# is indexed as c. $10.00 is indexed as $10.00, whereas $test is
indexed as test.
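
To make the rules above concrete, here is a rough Python sketch that mimics
them. It is only a toy model of what I described, not the actual SQL Server
word breaker (which is language-specific and handles many more cases). The
function name index_terms and the regular expression are my own, made up
for illustration, and the "$test" case reflects my reading of the last
example above.

```python
import re

# Toy approximation of the rules described in this post.
# NOT the real SQL FTS word breaker; names here are hypothetical.
TOKEN = re.compile(
    r"""
    (?P<acronym>(?:[A-Z]\.){2,})      # upper-case dotted acronym: F.B.I.
  | (?P<currency>\$\d+(?:\.\d+)?)     # dollar amount: $10.00
  | (?P<upper_sym>[A-Z]+[#+]+)        # + or # after upper case: C#, C++
  | (?P<word>[A-Za-z0-9]+)            # plain alphanumeric run
    """,
    re.VERBOSE,
)

def index_terms(text):
    """Return the terms this toy model would index for `text`."""
    terms = []
    for m in TOKEN.finditer(text):
        tok = m.group(0)
        terms.append(tok)
        # Upper-case acronyms are kept as typed and also without the periods.
        if m.lastgroup == "acronym":
            terms.append(tok.replace(".", ""))
    return terms

if __name__ == "__main__":
    for sample in ["F.B.I.", "f.b.i.", "C#", "c#", "$10.00", "$test"]:
        print(sample, "->", index_terms(sample))
```

Anything the pattern does not match is simply skipped, which is how the
sketch models "not indexed and treated as a word boundary".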
--
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"yuan" <yuan@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:C6B39E6B-AC72-4F1A-AC76-D07B63AFD913@xxxxxxxxxxxxxxxx
Hi
I would like to know which characters are considered word boundaries in
FTS and whether there is a way to configure this list of characters.
My question is about the '/' character which doesn't seem to be a word
boundary in FTS and I would like to change this behavior.
Thanks