Re: search phrase



Clarification...
I have directly confirmed with the SQL Dev team that this is functional
using only FREETEXT or FREETEXTTABLE under
SP4 (Service Pack 4), the most currently available service pack.


-- John

--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/



"John Kane" <JohnKane@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:D3BBB365-AAEE-429D-89BB-52873293F3E6@xxxxxxxxxxxxxxxx
> Yair,
> What you might want to try is using the SQL Server 2000 XML thersoursa
> files
> to include the company variations you are looking for. Note, that while
> this
> option is un-supported by Microsoft, I have directly confirmed with the
> SQL
> Dev team that this is functional using only FREETEXT or FREETEXTTABLE
> under
> SP3 as well.
>
> See KB article 309678 "FIX: Thesaurus Support is Not Available for
> Full-Text
> Search" at
> http://support.microsoft.com/default.aspx?scid=kb;en-us;309678&Product=sql2k
> for details. Below are the steps necessary to enable this functionality.
>
> use pubs
> go
>
> CREATE TABLE test_07062001 (
> PK int NOT NULL IDENTITY (1, 1),
> col char(50),
> PRIMARY KEY (PK)
> )
>
> insert test_07062001 (col) values('KW')
> insert test_07062001 (col) values('Karen Delaney')
> insert test_07062001 (col) values('Kalen Delaney')
> insert test_07062001 (col) values('Karen Waterson')
> insert test_07062001 (col) values('Karen Watterson')
> insert test_07062001 (col) values('KWatterson')
> insert test_07062001 (col) values('KarenW')
> insert test_07062001 (col) values('Sunset Boulevard')
> insert test_07062001 (col) values('Sunset Blvd')
> insert test_07062001 (col) values('Blvd of Broken Dreams')
> insert test_07062001 (col) values('Boulevard of Broken Dreams')
> insert test_07062001 (col) values('Stephen Dybing')
> insert test_07062001 (col) values('Steven Dybing')
>
> then build your FTS Index
> sp_fulltext_database 'enable'
> sp_fulltext_catalog 'Karen', 'create'
> sp_fulltext_table test_07062001, 'create', 'Karen', 'PK'
> sp_fulltext_column test_07062001, 'col', 'add', 1033
> sp_fulltext_table test_07062001, 'activate'
>
> After the Full Population is complete, run the following SQL FREETEXT
> queries
>
> select * from test_07062001 where freetext(col,'karen') -- 3 hits
> select * from test_07062001 where freetext(col,'steven') -- 1 hit
> select * from test_07062001 where freetext(col,'boulevard') -- 2 hits
>
> Navigate to c:\Program Files\Microsoft SQL
> Server\MSSQL\FTDATA\SQLServer\Config find a file named tseng.xml (note
> that
> this is for UK English and you need US English)
> HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ContentIndexCommon\LanguageResources\Override\SQLServer\English
> (United States) and changing TsaurusFile to read tsenu.xml - be sure to
> include the full path if you do).
>
> Modify your tseng.xml file and save it.
>
> <XML ID="Tahoe Thesaurus">
> <thesaurus xmlns="x-schema:tsSchema.xml">
> <replacement>
> <pat>steven</pat>
> <sub weight="1.0">Stephen</sub>
> </replacement>
> <replacement>
> <pat>blvd</pat>
> <sub weight="1.0">Boulevard</sub>
> </replacement>
> <replacement>
> <pat>KW</pat>
> <pat>KarenW</pat>
> <pat>KWatterson</pat>
> <pat>Karen Waterson</pat>
> <sub weight="1.0">Karen Watterson</sub>
> </replacement>
> <replacement>
> <pat>colour</pat>
> <sub weight="1.0">color</sub>
> </replacement>
> <expansion>
> <sub weight="0.5">run**</sub>
> <sub weight="0.5">jog**</sub>
> </expansion>
> </thesaurus>
> </XML>
> */
>
> Now, re-run the following SQL FREETEXT queries & compare the results from
> above:
>
> select * from test_07062001 where freetext(col,'karen')
> select * from test_07062001 where freetext(col,'steven')
> select * from test_07062001 where freetext(col,'boulevard')
>
> Hope that helps!
> John
> SQL Full Text Search Blog
> http://spaces.msn.com/members/jtkane/
>
>
>
> "Yair Nissan" wrote:
>> Hi,
>>
>> We are using SQL 2000 sp3. We have a Customers table in which the
>> customer's
>> name is FTS indexed.. Since the insertion into the DB is made manually
>> the
>> customers' names are often misspelled or written with small modifications
>> from the already existing names in our DB. The table is using
>> SQL_Latin1_General_CP1_CI_AS collation.
>>
>> We've removed all the content of the noise files since our customers are
>> from all over the world and we need to remove a different set of noise
>> words
>> for each country.
>>
>> I've been testing what the best search phrase for the customer name might
>> be, and currently we're using the following algorithm :
>>
>> 1. Remove all non alphanumeric characters from the customer name.
>> 2. Remove the country oriented "noise" words - such as Inc, LTD for US
>> customers , Gmbh of German customers etc .
>> 3. Split the customer name using the white space as the delimiter.
>> 4. Start querying the DB :
>> a. A FreeTextTable query with the full customer name (after stages
>> 1-3). If one of the results (after performing stages 1-3 on each of them)
>> is
>> the exact match it means we found the customer in the DB and we take the
>> appropriate actions.
>> b. If not, we remove the last word and go on until less than
>> ((number
>> of words in the original customer name) / 2) words are left. If we get to
>> that point it means that there is no record for the customer and we add a
>> new one.
>>
>> This query gives us the best results (performance is a different
>> issue...)
>> so far, however very often this algorithm doesn't work, especially when
>> the
>> customer name was entered with an extra white space ("Sun Microsystems"
>> and
>> "Sun Micro Systems") or using the plural version of the word ("Sun
>> Microsystem" AND "Sun Microsystems").
>>
>> I've been looking all over the net for good articles about that subject
>> but
>> found very few. I would appreciate any help you can provide.
>>
>> Thanks,
>> Yair
>


.



Relevant Pages

  • Re: search phrase
    ... > SQL Full Text Search Blog ... >> Dev team that this is functional using only FREETEXT or FREETEXTTABLE ... Below are the steps necessary to enable this functionality. ... Remove all non alphanumeric characters from the customer name. ...
    (microsoft.public.sqlserver.fulltext)
  • Re: OOP - a question about database access
    ... >>and project so much better in SQL DBMSes than in ODBMSes, ... >>100x more bytes from the database, just because you want your objects to ... > a related invoice.. ... > assoication from the customer to the invoice collection and have done ...
    (comp.object)
  • Re: object databases
    ... Now, tell me, you can do all that with 2 lines of code with SQL ... didn't provide the Customer, Order and OrderDetail classes. ... If you try to map classes to tables, ...
    (comp.object)
  • RE: search phrase
    ... What you might want to try is using the SQL Server 2000 XML thersoursa files ... After the Full Population is complete, run the following SQL FREETEXT queries ... "Yair Nissan" wrote: ... Remove all non alphanumeric characters from the customer name. ...
    (microsoft.public.sqlserver.fulltext)
  • Re: sql problem
    ... Account # is the link field on ... The requirement is I have to come up with a report comparing the two ... customer tables to see if they are in sync. ... what I wanted to do was to get the SQL to return the "like" names ...
    (comp.databases.oracle.server)