Re: search phrase
- From: "Yair Nissan" <yair@xxxxxxxxxxx>
- Date: Mon, 12 Sep 2005 11:03:42 +0200
Hi John,
Thanks for your suggestion, I will test it ASAP.
Yair
"John Kane" <jt-kane@xxxxxxxxxxx> wrote in message
news:uuKelVytFHA.1472@xxxxxxxxxxxxxxxxxxxxxxx
> Clarification...
> I have directly confirmed with the SQL Dev team that this is functional
> using only FREETEXT or FREETEXTTABLE under
> SP4 (Service Pack 4), the most currently available service pack.
>
>
> -- John
>
> --
> SQL Full Text Search Blog
> http://spaces.msn.com/members/jtkane/
>
>
>
> "John Kane" <JohnKane@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
> news:D3BBB365-AAEE-429D-89BB-52873293F3E6@xxxxxxxxxxxxxxxx
> > Yair,
> > What you might want to try is using the SQL Server 2000 XML thersoursa
> > files
> > to include the company variations you are looking for. Note, that while
> > this
> > option is un-supported by Microsoft, I have directly confirmed with the
> > SQL
> > Dev team that this is functional using only FREETEXT or FREETEXTTABLE
> > under
> > SP3 as well.
> >
> > See KB article 309678 "FIX: Thesaurus Support is Not Available for
> > Full-Text
> > Search" at
> >
http://support.microsoft.com/default.aspx?scid=kb;en-us;309678&Product=sql2k
> > for details. Below are the steps necessary to enable this functionality.
> >
> > use pubs
> > go
> >
> > CREATE TABLE test_07062001 (
> > PK int NOT NULL IDENTITY (1, 1),
> > col char(50),
> > PRIMARY KEY (PK)
> > )
> >
> > insert test_07062001 (col) values('KW')
> > insert test_07062001 (col) values('Karen Delaney')
> > insert test_07062001 (col) values('Kalen Delaney')
> > insert test_07062001 (col) values('Karen Waterson')
> > insert test_07062001 (col) values('Karen Watterson')
> > insert test_07062001 (col) values('KWatterson')
> > insert test_07062001 (col) values('KarenW')
> > insert test_07062001 (col) values('Sunset Boulevard')
> > insert test_07062001 (col) values('Sunset Blvd')
> > insert test_07062001 (col) values('Blvd of Broken Dreams')
> > insert test_07062001 (col) values('Boulevard of Broken Dreams')
> > insert test_07062001 (col) values('Stephen Dybing')
> > insert test_07062001 (col) values('Steven Dybing')
> >
> > then build your FTS Index
> > sp_fulltext_database 'enable'
> > sp_fulltext_catalog 'Karen', 'create'
> > sp_fulltext_table test_07062001, 'create', 'Karen', 'PK'
> > sp_fulltext_column test_07062001, 'col', 'add', 1033
> > sp_fulltext_table test_07062001, 'activate'
> >
> > After the Full Population is complete, run the following SQL FREETEXT
> > queries
> >
> > select * from test_07062001 where freetext(col,'karen') -- 3 hits
> > select * from test_07062001 where freetext(col,'steven') -- 1 hit
> > select * from test_07062001 where freetext(col,'boulevard') -- 2 hits
> >
> > Navigate to c:\Program Files\Microsoft SQL
> > Server\MSSQL\FTDATA\SQLServer\Config find a file named tseng.xml (note
> > that
> > this is for UK English and you need US English)
> >
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ContentIndexCommon\LanguageResources\O
verride\SQLServer\English
> > (United States) and changing TsaurusFile to read tsenu.xml - be sure to
> > include the full path if you do).
> >
> > Modify your tseng.xml file and save it.
> >
> > <XML ID="Tahoe Thesaurus">
> > <thesaurus xmlns="x-schema:tsSchema.xml">
> > <replacement>
> > <pat>steven</pat>
> > <sub weight="1.0">Stephen</sub>
> > </replacement>
> > <replacement>
> > <pat>blvd</pat>
> > <sub weight="1.0">Boulevard</sub>
> > </replacement>
> > <replacement>
> > <pat>KW</pat>
> > <pat>KarenW</pat>
> > <pat>KWatterson</pat>
> > <pat>Karen Waterson</pat>
> > <sub weight="1.0">Karen Watterson</sub>
> > </replacement>
> > <replacement>
> > <pat>colour</pat>
> > <sub weight="1.0">color</sub>
> > </replacement>
> > <expansion>
> > <sub weight="0.5">run**</sub>
> > <sub weight="0.5">jog**</sub>
> > </expansion>
> > </thesaurus>
> > </XML>
> > */
> >
> > Now, re-run the following SQL FREETEXT queries & compare the results
from
> > above:
> >
> > select * from test_07062001 where freetext(col,'karen')
> > select * from test_07062001 where freetext(col,'steven')
> > select * from test_07062001 where freetext(col,'boulevard')
> >
> > Hope that helps!
> > John
> > SQL Full Text Search Blog
> > http://spaces.msn.com/members/jtkane/
> >
> >
> >
> > "Yair Nissan" wrote:
> >> Hi,
> >>
> >> We are using SQL 2000 sp3. We have a Customers table in which the
> >> customer's
> >> name is FTS indexed.. Since the insertion into the DB is made manually
> >> the
> >> customers' names are often misspelled or written with small
modifications
> >> from the already existing names in our DB. The table is using
> >> SQL_Latin1_General_CP1_CI_AS collation.
> >>
> >> We've removed all the content of the noise files since our customers
are
> >> from all over the world and we need to remove a different set of noise
> >> words
> >> for each country.
> >>
> >> I've been testing what the best search phrase for the customer name
might
> >> be, and currently we're using the following algorithm :
> >>
> >> 1. Remove all non alphanumeric characters from the customer name.
> >> 2. Remove the country oriented "noise" words - such as Inc, LTD for
US
> >> customers , Gmbh of German customers etc .
> >> 3. Split the customer name using the white space as the delimiter.
> >> 4. Start querying the DB :
> >> a. A FreeTextTable query with the full customer name (after
stages
> >> 1-3). If one of the results (after performing stages 1-3 on each of
them)
> >> is
> >> the exact match it means we found the customer in the DB and we take
the
> >> appropriate actions.
> >> b. If not, we remove the last word and go on until less than
> >> ((number
> >> of words in the original customer name) / 2) words are left. If we get
to
> >> that point it means that there is no record for the customer and we add
a
> >> new one.
> >>
> >> This query gives us the best results (performance is a different
> >> issue...)
> >> so far, however very often this algorithm doesn't work, especially when
> >> the
> >> customer name was entered with an extra white space ("Sun Microsystems"
> >> and
> >> "Sun Micro Systems") or using the plural version of the word ("Sun
> >> Microsystem" AND "Sun Microsystems").
> >>
> >> I've been looking all over the net for good articles about that subject
> >> but
> >> found very few. I would appreciate any help you can provide.
> >>
> >> Thanks,
> >> Yair
> >
>
>
.
- References:
- search phrase
- From: Yair Nissan
- RE: search phrase
- From: John Kane
- Re: search phrase
- From: John Kane
- search phrase
- Prev by Date: Re: Backing Up
- Next by Date: how to Catalog Deploy
- Previous by thread: Re: search phrase
- Next by thread: Backing Up
- Index(es):
Relevant Pages
|