Re: Fulltext search too slow



On 3 Aug., 03:44, Simon Sabin <SimonSa...@xxxxxxxxxxxxxxx> wrote:
Hello dc,

Memory plays a big part, if FT doesn't have room to play you will have a
problem. Fix the memory used by SQL to total memory - 1Gb and see what teh
results are.

Is this 2000 or 2005.

The latter has some huge leaps forward

Simon Sabin
SQL Server MVPhttp://sqlblogcasts.com/blogs/simons



On 2 Aug., 17:27, "Daniel Crichton" <msn...@xxxxxxxxxxxxxxxx> wrote:

DC wrote on Thu, 02 Aug 2007 06:57:59 -0700:

Hi,
I am using an FT index on a column (varchar(4000) with an average
length of 100) in a table that contains about 1 million rows.
My tiny performance test looks like
declare @t datetime set @t = getdate()
select count(*) from containstable(table_name, expression,
'"searchthis*"')
print datediff (ms, @t, getdate())
If "searchthis" matches about 1000 rows, the search will usually
take
about 2 seconds. But when it matches a lot of rows (some terms will
match up to 200.000 rows) then the search takes up to one minute.
With
5.000 matches the search already takes about 15 seconds. It helps a
bit
to restrict the ft search to Top-20000-of-rows but not enough and
I need many results (not to display them but to display how many
results of what type were found).
This is too slow for my application and I am sure the performance
has
degraded for some reason. I am rebuilding the ft index every week
and
reorganizing it every night. CPU and disk queues look relaxed so
there
might be a less apparent bottleneck.
Does someone know where I should start looking for an enhancement,
or
are these very rough performance figures normal for a 4 * 3 GHz
machine?
Thanks for any hint in advance,
Regards
DC
There are many factors that will affect the speed of your queries, so
it's hard to recommend any one thing to look for.

I've got a 2 * 3 Ghz server here that I run my websites on, with a
product database of around 750000 items. For searches with small
numbers of results (less than 5000 for instance), it runs pretty
quickly and there is little difference between using a wildcard and
not. Trying to find a word that will give me around 20000 results
without a wildcard, and maybe double that with one, is going to take
me a while to nail down so I can't really run any sort of comparison
here myself, but I can tell you that if I search for "the" it takes
28 seconds to return just over 300,000 results, and with "the*" it
takes only 13 secs to bring back just over 320,000 results. However,
this was after some intial runs so that data is cached - with
uncached data it takes 4 or 5 times as long. Of course, searching for
"the" followed by "the*" is not a great test, as some of the data
from the first search will remain cached for the second one to use,
but with repeated runs of just "the" I seem to get no improvement
beyond 25 seconds or so, yet "the*" drops from 13 secs to 6 secs with
repeats.

You'll have to do some extensive testing to work out how to improve
the result speed, but the obvious one is to limit the number of
results as far as possible, and to try to only allow searching for
words that will return few results.

Given the results I've found in my own database I'm going to run some
tests for actual searches customers have entered and investigate
whether wildcarding them will in general reduce the search times
(obviously at the expensive of less exact matches being returned),
however our average search response time is under a second for
customer entered searches anyway so it may just be a waste of time
for me (it's rare that a customer will enter a single common word
like "the" or "of" to search for, they tend to enter a few words that
return few matches).

Dan- Zitierten Text ausblenden -

- Zitierten Text anzeigen -

Thank you for sharing all this info, Dan!

It pretty much looks like I will not be getting much more performance
from the build in ft engine. With SQL Server 2000 I have been using
SQL Turbo, which was a lot faster and had great ranking features - but
it was poorly supported so I was happy to get away from it thinking
that MS finally found out how design a solid fulltext search - which
apparently they have not. Luckily I am not fully relying ft index, I
am also caching search terms and their results. I am using this for a
security paper database so the searches repeat often and most of the
hits can be answered without the ft index.

I ran the litte test script attached to get a better feel for the
dependency of search expression, number of hits and latency fo the
search.

expression: "bank*", hits: 177558,
elapsedms: 20746
expression: "oil*", hits: 5405,
elapsedms: 360
expression: "citibank*", hits: 20478,
elapsedms: 296
expression: "daimler*", hits: 11294,
elapsedms: 220
expression: "daimlerc*", hits: 10815,
elapsedms: 110
expression: "daimlerch*", hits: 10813,
elapsedms: 106
expression: "daimlerchr*", hits: 10807,
elapsedms: 96
That was uncached. Why did "bank*" take so much longer compared to
"citibank*"? Next run:

expression: "bank*", hits: 177558,
elapsedms: 1126
expression: "oil*", hits: 5405,
elapsedms: 46
expression: "citibank*", hits: 20478,
elapsedms: 126
expression: "daimler*", hits: 11294,
elapsedms: 106
expression: "daimlerc*", hits: 10815,
elapsedms: 93
expression: "daimlerch*", hits: 10813,
elapsedms: 80
expression: "daimlerchr*", hits: 10807,
elapsedms: 76
That of course looks better. Subsequent runs are about as fast but of
course SQL Server will wipe the cache rather soon. Another example,
uncached:

expression: "Stock*", hits: 104299,
elapsedms: 1233
expression: "Mineral*", hits: 580,
elapsedms: 110
expression: "Oat*", hits: 241,
elapsedms: 30
expression: "AG*", hits: 172584,
elapsedms: 2016
expression: "Corp*", hits: 10725,
elapsedms: 123
expression: "Option*", hits: 418614,
elapsedms: 7503
expression: "Amazon*", hits: 1104,
elapsedms: 13
Hm. This one was fine too, and the expressions were not cached. What
was the problem with the initial "bank*"?

I read something about the ft engine trying to connect to the internet
for it's wordbreaker, maybe that has to do with it? I will check.

Regards
DC
set nocount on

declare @t table (ex varchar(31));
insert into @t (ex) values ('"bank*"');
insert into @t (ex) values ('"Mineral*"');
insert into @t (ex) values ('"Oat*"');
insert into @t (ex) values ('"AG*"');
insert into @t (ex) values ('"Corp*"');
insert into @t (ex) values ('"Option*"');
insert into @t (ex) values ('"Amazon*"');
declare
@ex varchar(31),
@st datetime,
@hits int
declare c cursor for
select ex from @t;
open c;
fetch next from c into @ex;
while @@fetch_status = 0
begin
set @st = getdate();
select @hits = count(*) from containstable(tbl_instruments_ft,
expression, @ex);
print
'expression: '
+ convert(varchar, @ex)
+ ', ' + space(31 - len(@ex)) + 'hits: '
+ convert(varchar, @hits)
+ ', ' + space(10 - len(convert(varchar, @hits))) + 'elapsedms: '
+ convert(varchar, datediff (ms, @st, getdate()));
fetch next from c into @ex;
end
close c
deallocate c- Zitierten Text ausblenden -

- Zitierten Text anzeigen -

Hi Simon,

I have now configured the /3gb switch in order to make room for the ft
services. Sure enought the SQL Server service happily grabbed 3 GB so
I configured it to use 2 GB maximum.

In off-times, the ft performance looked well, the more broad search
terms would not take more than 3-4 seconds and the average search time
was about 100ms. But when queries started pounding on the server again
(and most of those queries are not related to the ft index) the search
time for broad terms went up to 20+ seconds (while the average was
still at OK at 190ms). There are no obvious bottlenecks like disk or
CPU.

I have been watching the msftesql.exe memory usage while load testing
ft searches. When idle, the process uses about 20MB of memory. When I
start testing, the usage is quickly jumping around between 20 and 130
MB, 130 MB beeing a rare maximum. Usually the memory usage is quickly
moving between 20 and 80 MB when the searches are active.

My ft catalogue files only occupy 100 MB on disk. My impression is,
that msftesql.exe never loads the full index into memory and is very
eager to release memory and that this reduces the ft performance of
broad terms.

Maybe there is a way to pin the index data or to somehow configure
memory usage of msftesql.exe?

Regards
DC

.



Relevant Pages

  • Re: Fulltext search too slow
    ... Fix the memory used by SQL to total memory - 1Gb and see what teh ... SQL Server MVPhttp://sqlblogcasts.com/blogs/simons ... elapsedms: 360 ...
    (microsoft.public.sqlserver.fulltext)
  • Re: Fulltext search too slow
    ... Fix the memory used by SQL to total memory - 1Gb and see what teh ... SQL Server MVPhttp://sqlblogcasts.com/blogs/simons ... elapsedms: 360 ... I have been watching the msftesql.exe memory usage while load testing ...
    (microsoft.public.sqlserver.fulltext)
  • Re: Fulltext search too slow
    ... Memory plays a big part, if FT doesn't have room to play you will have a problem. ... Fix the memory used by SQL to total memory - 1Gb and see what teh results are. ... hits can be answered without the ft index. ... elapsedms: 360 ...
    (microsoft.public.sqlserver.fulltext)
  • Re: Locking pages in memory
    ... So I think the 64-bit Standard does not support locked pages in memory. ... Rick Byham, SQL Server Books Online ...
    (microsoft.public.sqlserver.setup)
  • Re: Locking pages in memory
    ... the account SQL runs under) the permission to lock pages in memory. ... Server Standard, Enterprise, and Developer editions: Required for SQL Server ...
    (microsoft.public.sqlserver.setup)

Loading