Re: Thesaurus Problem
- From: Organic Man <davidmcmillin@xxxxxxx>
- Date: Wed, 29 Aug 2007 11:58:20 -0700
On Aug 29, 2:46 pm, Organic Man <davidmcmil...@xxxxxxx> wrote:
On Aug 29, 8:27 am, "Hilary Cotter" <hilary.cot...@xxxxxxxxx> wrote:
Please refer to this document:
http://msdn2.microsoft.com/en-us/library/ms345186.aspx
Note how you are supposed to save it: "When you are editing thesaurus
files by using text editor tools, the files must be saved in Unicode format
and Byte Order Marks must be specified."
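One way to confirm what an editor actually wrote is to look at the first
few bytes of the saved file. The following is a minimal sketch, not part
of the original advice; the function names detect_bom and file_bom are
made up for illustration, and it only reports which byte order mark (if
any) is present.

```python
import codecs

# Known byte order marks. UTF-32 is tested before UTF-16 because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if absent."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def file_bom(path: str):
    """Read only the first four bytes of a file and report its BOM."""
    with open(path, "rb") as fh:
        return detect_bom(fh.read(4))
```

Calling file_bom on the thesaurus file should return "utf-16-le" if the
file was saved the way Notepad's "Unicode" option saves it; None would
mean no byte order mark was written at all.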
--
RelevantNoise.com - dedicated to mining blogs for business intelligence.
Looking for a SQL Server replication book? http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS http://www.indexserverfaq.com
"Organic Man" <davidmcmil...@xxxxxxx> wrote in message
news:1188385539.078829.282070@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Aug 28, 9:30 pm, "Hilary Cotter" <hilary.cot...@xxxxxxxxx> wrote:
oops try this:
sp_configure 'show advanced options', 1
Reconfigure with Override
GO
sp_configure 'default full-text language'
GO
It looks like you need to edit tsenu.xml.
"Organic Man" <davidmcmil...@xxxxxxx> wrote in message
news:1188321109.818266.43220@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Aug 28, 11:33 am, "Hilary Cotter" <hilary.cot...@xxxxxxxxx> wrote:
If you are searching in American English the thesaurus file is
tsENU.xml; if you are searching in British or International English
it's tsENG.xml.
Do a select from your SQL Server:
select @@language
sp_configure 'show advanced options,1
Reconfigure with Override
GO
sp_configure 'default full-text language'
GO
"Organic Man" <davidmcmil...@xxxxxxx> wrote in message
news:1188306888.645753.16050@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Aug 28, 5:35 am, "Hilary Cotter" <hilary.cot...@xxxxxxxxx> wrote:
You don't need to rebuild your catalogs. You do need to restart
MSFTESQL, but you don't need to reboot.
What language are you querying in, and what language is your content
in? Is it binary content?
"Organic Man" <davidmcmil...@xxxxxxx> wrote in message
news:1188272200.321569.78670@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
On Aug 27, 10:49 pm, "Hilary Cotter" <hilary.cot...@xxxxxxxxx> wrote:
Did you save your thesaurus file with Unicode encoding? After SP2 it
must be saved as a Unicode file.
"Organic Man" <davidmcmil...@xxxxxxx> wrote in message
news:1188250168.087879.56150@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I am trying to get the full-text search thesaurus to work in SQL
Server 2005 Express edition. I live in the USA so I assume
tsENU.xml is the appropriate file to modify. I used Notepad to
modify the tsENU.xml file and saved it as Unicode:
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics_sensitive>0</diacritics_sensitive>
<expansion>
<sub>littre</sub>
<sub>leydig</sub>
</expansion>
<replacement>
<pat>NT5</pat>
<pat>W2K</pat>
<sub>Windows 2000</sub>
</replacement>
<expansion>
<sub>run</sub>
<sub>jog</sub>
</expansion>
</thesaurus>
</XML>
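Because the thesaurus file is hand-edited XML, a quick well-formedness
check can rule out slips such as a mistyped closing tag before any
full-text query is run. The sketch below is only an illustration using
Python's standard xml.etree.ElementTree parser; check_well_formed and the
two sample strings are hypothetical, and passing this check says nothing
about whether SQL Server accepts the file.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str):
    """Return None if the text parses as XML, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Hypothetical fragments: the second has a closing tag missing its '<'.
GOOD = "<expansion><sub>run</sub><sub>jog</sub></expansion>"
BAD = "<expansion><sub>run</sub><sub>jog/sub></expansion>"

if __name__ == "__main__":
    for label, text in (("GOOD", GOOD), ("BAD", BAD)):
        problem = check_well_formed(text)
        print(label, "is well-formed" if problem is None else "fails: " + problem)
```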
I closed Management Studio and reopened to allow MSFTESQL service to
restart. Then I ran these queries:
SELECT FullDocNo
FROM FullDocuments
WHERE CONTAINS(SectionText, 'littre') *** returned 3 rows ***
SELECT FullDocNo
FROM FullDocuments
WHERE CONTAINS(SectionText, 'leydig') *** returned 169 rows ***
SELECT FullDocNo
FROM FullDocuments
WHERE CONTAINS(SectionText, ' FORMSOF(THESAURUS, littre) ')
*** returned 6 rows ***
Thus the Thesaurus does not seem to be working since it should have
returned at least 169 rows. I rebooted my entire system to make sure
Sql Server is starting fresh.
Any help in sorting this out will be greatly appreciated.
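The "at least 169 rows" expectation follows from how an expansion set is
meant to behave: a thesaurus query for one term should match documents
containing any term in the same set, so the result is the union of the
individual matches. The toy model below only illustrates that reasoning.
The function names, the dictionary standing in for the full-text index,
and the document ids are all hypothetical, and it is not a description of
SQL Server's internals.

```python
def expand(term, expansion_sets):
    """Return the term plus every synonym sharing an expansion set with it."""
    wanted = {term}
    for group in expansion_sets:
        if term in group:
            wanted |= set(group)
    return wanted


def thesaurus_hits(term, expansion_sets, index):
    """Union of document ids matching any expanded term.

    `index` is a hypothetical mapping from term to a set of document ids,
    standing in for the full-text index.
    """
    hits = set()
    for synonym in expand(term, expansion_sets):
        hits |= index.get(synonym, set())
    return hits


if __name__ == "__main__":
    # Made-up data: two expansion sets and a tiny index.
    sets = [["littre", "leydig"], ["run", "jog"]]
    index = {"littre": {1, 2, 3}, "leydig": {3, 4, 5, 6}}
    print(sorted(thesaurus_hits("littre", sets, index)))
```

In this model the thesaurus result can never be smaller than the larger
of the two plain-term results, which is why a count below 169 suggests
the expansion set was not applied.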
Hi Hilary,
Yes, I saved as unicode and double checked to be sure.
Are there any specific instructions for restarting the MSFTESQL
service, other than closing and reopening Management Studio?
Does the full-text catalog need to be updated?
I have tried several things; still no luck so far:
1. I am querying in English. To be safe I made the changes to the
ts.ENS.xml, ts.ENG.xml, and ts.ESN.xml files.
2. The datatype for the SectionText column that I am searching was
ntext. I inherited this db which was originally created in Access and
migrated to Sql 2000. So I changed the datatype to nvarchar(max) to
make it conform to modern standards. Still no luck.
3. I created a new SQL Server 2005 db with full-text search enabled
and moved all the data over from the old one which I believe is still
SQL 2000.
4. I wondered if case mattered in the xml file (since Leydig is in
caps in SectionText), so I altered the xml like this:
<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics_sensitive>0</diacritics_sensitive>
<expansion>
<sub>Leydig</sub>
<sub>leydig</sub>
<sub>littre</sub>
</expansion>
<replacement>
<pat>NT5</pat>
<pat>W2K</pat>
<sub>Windows 2000</sub>
</replacement>
<expansion>
<sub>run</sub>
<sub>jog</sub>
</expansion>
</thesaurus>
</XML>
5. I wondered if this could be a limitation of Sql Server 2005
Express, so I moved everything over to my Vista machine running Sql
Server 2005 Developer Edition. No luck.
6. I wondered if there was something problematic with my choice of
search terms so tried new ones by changing xml and sql like this:
<expansion>
<sub>vagus</sub>
<sub>pneumogastric</sub>
</expansion>
SELECT FullDocNo
FROM FullDocuments
WHERE CONTAINS(SectionText, 'vagus') *** returned 213 rows ***
SELECT FullDocNo
FROM FullDocuments
WHERE CONTAINS(SectionText, 'pneumogastric') *** returned 514 rows ***
SELECT FullDocNo
FROM FullDocuments
WHERE CONTAINS(SectionText, ' FORMSOF(THESAURUS, vagus) ')
*** returned 213 rows ***
So it still does not seem to work.
In case you are wondering, the db contains antiquated medical terms
that are unfamiliar to modern clinicians and researchers but are
useful when interpreting old medical manuscripts.
Hilary, I am out of ideas for now. Any thoughts on what is wrong?
Here is what I got with the T-Sql:
select @@language returned us_english
sp_configure 'show advanced options,1'
Reconfigure with Override
GO
Msg 15123, Level 16, State 1, Procedure sp_configure, Line 51
The configuration option 'show advanced options,1' does not exist, or
it may be an advanced option.
sp_configure 'default full-text language'
GO
Msg 15123, Level 16, State 1, Procedure sp_configure, Line 51
The configuration option 'default full-text language' does not exist,
or it may be an advanced option.
Hilary,
Thanks for your patience and assistance.
I edited tsenu.xml with Notepad as follows:
<XML ID="Microsoft Search Thesaurus">
<thesaurus
...
Notepad does not provide any specific reference to Byte Order Marks
(BOM). But I did try saving in all the available formats in Notepad
(ANSI, Unicode, Big Endian, and UTF-8).
I tried using the XML editors in Visual Studio, Expression Web, and
Dreamweaver.
I was hopeful with Dreamweaver since the save options included a
default checkbox for "Include Unicode Signature (BOM)".
I then downloaded an XML editor and saved it from there.
Nothing changed the results I reported above.
To test if the file format is the problem, does it make sense for
someone to include the terms I am using in a tsENU.xml file that is
known to be in the correct format and send it to me as an attached
file that can be pasted into my FTData folder? That should eliminate
file format as an issue.
<expansion>
<sub>Leydig</sub>
<sub>leydig</sub>
<sub>littre</sub>
</expansion>
<expansion>
<sub>vagus</sub>
<sub>pneumogastric</sub>
</expansion>
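A file with a known encoding can also be produced without relying on any
editor's save dialog. The sketch below is one way to do that, under the
assumption that "Unicode" here means UTF-16 little-endian with a leading
byte order mark (which is what Notepad's "Unicode" option writes). The
helper name write_utf16le_with_bom is made up, the file body simply
reuses the terms from this thread, and the output goes to a temporary
directory rather than to FTData.

```python
import codecs
import os
import tempfile


def write_utf16le_with_bom(path: str, text: str) -> None:
    """Write text as UTF-16 LE, prefixed with an explicit byte order mark."""
    with open(path, "wb") as fh:
        fh.write(codecs.BOM_UTF16_LE)
        fh.write(text.encode("utf-16-le"))


# Thesaurus content assembled from the terms quoted in this thread.
THESAURUS = """<XML ID="Microsoft Search Thesaurus">
<thesaurus xmlns="x-schema:tsSchema.xml">
<diacritics_sensitive>0</diacritics_sensitive>
<expansion>
<sub>Leydig</sub>
<sub>leydig</sub>
<sub>littre</sub>
</expansion>
<expansion>
<sub>vagus</sub>
<sub>pneumogastric</sub>
</expansion>
</thesaurus>
</XML>
"""

if __name__ == "__main__":
    # Copying the result into the FTData folder is left as a manual step.
    target = os.path.join(tempfile.mkdtemp(), "tsENU.xml")
    write_utf16le_with_bom(target, THESAURUS)
    with open(target, "rb") as fh:
        print("BOM present:", fh.read(2) == codecs.BOM_UTF16_LE)
```

Writing the mark explicitly, rather than relying on a codec's default
byte order, keeps the output identical regardless of the machine the
script runs on.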