Re: WEIGHT confusion (SQL 2k, running on Win2k3)

Tech-Archive recommends: Fix windows errors by optimizing your registry

From: John Kane (jt-kane_at_comcast.net)
Date: 08/31/04


Date: Tue, 31 Aug 2004 16:28:16 -0700

Dunc,
Ok, what I think you really need is a multiple column join using a single
table example using an OR condition... (BTW, this is the 3rd one I've posted
today, although, not the exact version as below..;). Below is an example of
this...

use Northwind
SELECT distinct e.LastName, e.FirstName, e.Title
   from Employees AS e,
     containstable(Employees, Notes, 'BTS') as A,
     containstable(Employees, Title, 'Representative') as B
       where
         A.[KEY] = e.EmployeeID or -- NOTE OR condition here
         B.[KEY] = e.EmployeeID

Note, that I had to use the distinct keyword on the LastName column to
remove duplicate results. Try altering your query below to the above with
the column specific weights as well as OR between the KEY comparisons. While
I fully understand why you want to do this and that you're trying to develop
a viable solution to a failing of SQL FTS, you should know that this
multiple CONTAINSTABLE query will perform very poorly as each CONTAINSTABLE
clause is a "round trip" to the FT Catalog. So, as your table and therefore
the FT Catalog query grows, this query will get slower and slower...

Regards,
John

"Dunc" <dunc@ntpcl.f9.co.uk> wrote in message
news:e1SCiI6jEHA.2692@TK2MSFTNGP12.phx.gbl...
> Hi John,
>
> I've made the change to the JOIN statements (ex LEFT JOIN), but that
hasn't
> worked, as they keyword I'm searching for isn't in every field -
potentially
> only one - so it's never returning any results.
>
> This SQL code is running on a mirror of a site that's live and running at
> the moment - it's a bar review website, run by a bunch of hobby'ists (i.e.
> it's all done by volunteers - techies, marketing, and a bunch of journos).
> As much as I'd like 100k rows, 2.2k is actually a pretty high number for
the
> data it's storing. When we launch the next phase, there will be more like
> 40k, but that will be branched down by region to ~5k per region.
>
> I'm really appreciating your help - if you'd like to get a better idea of
> what we're trying to achieve, the site is at:
> http://www.fluidfoundation.com
>
> You'll see the search on the right hand side of every page, or in the
middle
> of the front page.
>
> Dunc
>
> "John Kane" <jt-kane@comcast.net> wrote in message
> news:Osb7SV5jEHA.2668@TK2MSFTNGP10.phx.gbl...
> > You're welcome, Dunc,
> > Not to worry, mostly, I need to know the SQL Sever version and OS
platform
> > info, our language settings, and row count of your FT-enable table...
This
> > is all basic configuration info necessary to troubleshoot SQL FTS
> issues...
> >
> > You're using SQL Server 2000 SP3 Developer Edition on Win2003 with the
> > default language of US_English on at table with 2,272 rows. Can I assume
> > that your goal here is to have the search keyword ("late" in your query
> > below) have a greater weight from one column vs. another?
> >
> > If so, I see two issues... One, is that you should use only "INNER JOIN"
> or
> > just JOIN (defaults to INNER JOIN) vs. LEFT JOIN in your query - see the
> > Northwind example below. Secondly, your table size of only 2,272 rows
may
> be
> > too small to get a significant variation of query specific RANK values
and
> > then be *influenced* by the Weight parameter. It would better to re-test
> > your modified query against a larger table, say with at least 10,000
rows.
> >
> > -- note, the use of JOIN (defaults to INNER JOIN) only...
> > use Northwind
> > go
> > SELECT e.EmployeeID, e.LastName, ct.[KEY], ct.[RANK]
> > FROM Employees AS e
> > JOIN CONTAINSTABLE(Employees, Notes, 'French') AS ct ON e.EmployeeID
=
> > ct.[KEY]
> > ORDER BY ct.[RANK] DESC
> >
> > If it is your goal to have the same keyword rank higher or lower in one
> > column over another columns, I may have another method for achieving
this,
> > but I need to test it out a bit with a variable weight parameter.
> >
> > Thanks,
> > John
> >
> >
> >
> > "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message
> > news:ehc4Zw4jEHA.1764@TK2MSFTNGP10.phx.gbl...
> > > I've had to temporarily remove the item* searches, as it was returning
> too
> > > much rubbish.
> > >
> > > I'm a little nervous re: posting some of this info to a newsgroup, so
> I've
> > > renamed a few columns and table names. This query actually works
> > > *reasonably* well, though you'll notice that only one column has a
> weight
> > of
> > > anything apart from 1.0 - and that's the name column (which is the
most
> > > important). If I raise this up at all, the name seems to drop to a
> > > significantly less relevance, and everything goes pear-shaped.
> > >
> > > Thanks in advance,
> > >
> > > Dunc
> > >
> > > ---/ snip /---
> > >
> > > language: us_english
> > >
> > > version: Microsoft SQL Server 2000 - 8.00.760 (Intel X86) Dec 17
2002
> > > 14:22:05 Copyright (c) 1988-2003 Microsoft Corporation Developer
> > Edition
> > > on Windows NT 5.2 (Build 3790: )
> > >
> > > default full-text language: default full-text language 0 2147483647
1033
> > > 1033
> > >
> > > fulltext_catalogs: 5 MyDirectory e:\SQLServer\MSSQL\FTDATA 0 7
> > >
> > > fulltext_tables (amongst a few irrelevant tables):
> > > dbo VenueList PK_VenueList 1 1 MyDirectory
> > > dbo VenueExtras PK_VenueExtras 1 1 MyDirectory
> > >
> > > fulltext_columns (amongst a few irrelevant columns):
> > > dbo 1856725667 VenueList VenueName 2 NULL NULL 1033
> > > dbo 1856725667 VenueList Addr1 3 NULL NULL 1033
> > > dbo 1856725667 VenueList Addr2 4 NULL NULL 1033
> > > dbo 1856725667 VenueList Addr3 5 NULL NULL 1033
> > > dbo 1856725667 VenueList Addr4 6 NULL NULL 1033
> > > dbo 1856725667 VenueList Addr5 8 NULL NULL 1033
> > > dbo 1856725667 VenueList Postcode 9 NULL NULL 1033
> > > dbo 1856725667 VenueList Style 20 NULL NULL 1033
> > > dbo 1856725667 VenueList Theme 22 NULL NULL 1033
> > > dbo 1856725667 VenueList Description 23 NULL NULL 1033
> > > dbo 1856725667 VenueList NearestTransport 43 NULL NULL 1033
> > > dbo 1856725667 VenueList Review 48 NULL NULL 1033
> > > dbo 1245247491 VenueExtras ExtraName 2 NULL NULL 1033
> > >
> > > Venuelist Count: 2272
> > >
> > > Full Query from profiler (someone searched on the term "late"):
> > > SELECT * FROM (
> > > SELECT VenueID, VenueName, Addr1, Addr2, Addr3, Addr4, Review,
> > > ISNULL(B.RANK, 0) + ISNULL(C.RANK, 0) + ISNULL(D.RANK, 0) +
> ISNULL(E.RANK,
> > > 0) + ISNULL(F.RANK, 0) + ISNULL(G.RANK, 0) + ISNULL(H.RANK, 0) +
> > > ISNULL(I.RANK, 0) + ISNULL(J.RANK, 0) + ISNULL(K.RANK, 0) +
> > > ISNULL(L.RANK, 0) + ISNULL(M.RANK, 0) + ISNULL(N.RANK, 0) AS Rank
> > > FROM VenueList AS A
> > > LEFT JOIN CONTAINSTABLE(VenueList, Venue, 'ISABOUT ("late"
> WEIGHT(0.1))')
> > AS
> > > B
> > > ON A.VenueID = B.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Addr1, 'ISABOUT ("late"
WEIGHT(1))')
> AS
> > C
> > > ON A.VenueID = C.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Addr2, 'ISABOUT ("late"
WEIGHT(1))')
> AS
> > D
> > > ON A.VenueID = D.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Addr3, 'ISABOUT ("late"
WEIGHT(1))')
> AS
> > E
> > > ON A.VenueID = E.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Addr4, 'ISABOUT ("late"
WEIGHT(1))')
> AS
> > F
> > > ON A.VenueID = F.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Addr5, 'ISABOUT ("late"
WEIGHT(1))')
> AS
> > G
> > > ON A.VenueID = G.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Postcode, 'ISABOUT ("late"
> WEIGHT(1))')
> > > AS H
> > > ON A.VenueID = H.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Style, 'ISABOUT ("late"
WEIGHT(1))')
> AS
> > I
> > > ON A.VenueID = I.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Theme, 'ISABOUT ("late"
WEIGHT(1))')
> AS
> > J
> > > ON A.VenueID = J.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Description, 'ISABOUT ("late"
> > > WEIGHT(1))') AS K
> > > ON A.VenueID = K.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, NearestTransport, 'ISABOUT ("late"
> > > WEIGHT(1))') AS L
> > > ON A.VenueID = L.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueList, Review, 'ISABOUT ("late"
WEIGHT(1))')
> > AS
> > > M
> > > ON A.VenueID = M.[Key]
> > > LEFT JOIN CONTAINSTABLE(VenueExtras, ExtraName, 'ISABOUT ("late"
> > > WEIGHT(1))') AS N
> > > ON A.VenueID = N.[Key]
> > > WHERE A.Status > 0
> > > AND (B.RANK IS NOT NULL
> > > OR C.RANK IS NOT NULL
> > > OR D.RANK IS NOT NULL
> > > OR E.RANK IS NOT NULL
> > > OR F.RANK IS NOT NULL
> > > OR G.RANK IS NOT NULL
> > > OR H.RANK IS NOT NULL
> > > OR I.RANK IS NOT NULL
> > > OR I.RANK IS NOT NULL
> > > OR K.RANK IS NOT NULL
> > > OR L.RANK IS NOT NULL
> > > OR M.RANK IS NOT NULL
> > > OR N.RANK IS NOT NULL)
> > > ) SubQuery ORDER BY Rank DESC, Venue
> > >
> > > "John Kane" <jt-kane@comcast.net> wrote in message
> > > news:OCmmFZtjEHA.3016@tk2msftngp13.phx.gbl...
> > > > Dunc,
> > > > Could you post the full output of the following SQL script as the
info
> > > would
> > > > be helpful in understanding this...
> > > >
> > > > use <your_database_name_here>
> > > > go
> > > > SELECT @@language
> > > > SELECT @@version
> > > > -- Note, may need to set 'show advanced options'
> > > > EXEC sp_configure 'default full-text language'
> > > > EXEC sp_help_fulltext_catalogs
> > > > EXEC sp_help_fulltext_tables
> > > > EXEC sp_help_fulltext_columns
> > > > EXEC sp_help VenueList
> > > > go
> > > > SELECT count(*) from VenueList
> > > > go
> > > >
> > > > Could you also run profiler and re-execute your below query with the
> > > > function BreakSearchStringToIsAbout and post the exact SQL code that
> it
> > is
> > > > generating? You might also want to review SQL Server 2000 BOL title
> > > > "Full-text Search Recommendations" and it's last paragraph on RANK.
> > > >
> > > > Thanks,
> > > > John
> > > >
> > > >
> > > >
> > > >
> > > > "Dunc" <dunc@ntpcl.f9.co.uk> wrote in message
> > > > news:e3IC#GtjEHA.2652@TK2MSFTNGP15.phx.gbl...
> > > > > I'm dynamically creating a SQL statement that takes each word from
a
> > > > > sentence, and using the WEIGHT command, searches a number of
fields
> > > twice
> > > > > for each word - once with a given weight for the exact word, and
> once
> > > with
> > > > > half the weight for a partial match (by putting asterisks at the
> start
> > > and
> > > > > end of each word).
> > > > >
> > > > > Going against all the documentation I've got, items with a lower
> > weight
> > > > seem
> > > > > to be coming up higher. After a bit more playing around (I
removed
> > the
> > > > > partial match functionality), the lower the number, the higher the
> > rank
> > > > > seems to be. I can't just multiply the partial match value, as
> it'll
> > > > > potentially go over one.
> > > > >
> > > > > Either I'm doing something dumb, or I've missed a pretty valuable
> > piece
> > > of
> > > > > documentation. Can someone point me in the right direction here?
> > I've
> > > > put
> > > > > a few key snippets of the code below.
> > > > >
> > > > > Thanks in advance,
> > > > >
> > > > > Dunc
> > > > >
> > > > > "FROM VenueList " & _
> > > > > "LEFT JOIN CONTAINSTABLE(VenueList, VenueName, 'ISABOUT (" &
> > > > > BreakSearchStringToIsAbout(arrSearchString, 0.5) & ")') AS B " & _
> > > > > " ON A.VenueID = B.[Key] " & _
> > > > >
> > > > > ---/ snip /---
> > > > >
> > > > > Function BreakSearchStringToIsAbout(arrSearchItem, iWeight)
> > > > > Dim iLoop
> > > > > Dim strOutput
> > > > >
> > > > > strOutput = ""
> > > > > For iLoop = 0 to uBound(arrSearchItem)
> > > > > strOutput = strOutput & """" & arrSearchItem(iLoop) & """" & "
> > > WEIGHT("
> > > > &
> > > > > iWeight & "), "
> > > > > strOutput = strOutput & """*" & arrSearchItem(iLoop) & "*""" & "
> > > > WEIGHT("
> > > > > & iWeight / 2 & "), "
> > > > > Next
> > > > >
> > > > > strOutput = Left(strOutput, Len(strOutput) - 2)
> > > > >
> > > > > BreakSearchStringToIsAbout = strOutput
> > > > > End Function
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>



Relevant Pages

  • saved queries based on other saved queries vs. one big query in a forms module using DAO
    ... flawed set of queries I made and now it's finally back to the drawing board. ... I have several saved queries, ... The activities each have a weight within that group and the ... The first time I tried this I made a query that got the summary infomation ...
    (microsoft.public.access.queries)
  • RE: subtraction
    ... completed all changes you suggested and the Query works until I try nto make ... a report with it or a form in datasheet view. ... Is there anyway to do this on the actual form where I enter the weight info? ... >>> In the report instead of having a running sum is there a way to have a ...
    (microsoft.public.access.reports)
  • Re: FreeTextTable Rank
    ... You're welcome, Alex, ... SQL FTS queries such as the one below. ... > and this is the query, ... The Rank value from FREETEXTTABLE is ...
    (microsoft.public.sqlserver.fulltext)
  • Re: How can I create a list query?
    ... "Lost in Query" wrote: ... weight gain or loss compared to each other. ... over on a Older table's NameID field. ... Newer table down in to field row ...
    (microsoft.public.access.queries)
  • Re: why does MATCH/AGAINST fail to catch entries that LIKE does catch?
    ... "The MySQL FULLTEXT implementation regards any sequence of true word characters as a word. ... Apostrophes at the beginning or the end of a word are stripped by the FULLTEXT parser; 'aaa'bbb' would be parsed as aaa'bbb. ... "Every correct word in the collection and in the query is weighted according to its significance in the collection or query. ... Consequently, a word that is present in many documents has a lower weight, because it has lower semantic value in this particular collection. ...
    (comp.lang.php)