Re: Incremental population stops

From: John (John_at_Comprompter.com)
Date: 10/06/04


Date: Wed, 6 Oct 2004 13:26:05 -0500

Hi John,

I got some information from one of the clients to answer your questions.
(There are apparently 2 clients with this issue.)

This client was able to send me a copy of the application event log, which I
have attached here. Apparently, after every restart of the Microsoft Search
service, there is a 1-3 hour gap until a warning (event ID 3035) is
reported. The warning mentions a gatherer log, which I don't yet have access
to.

There are also a couple of MssCi entries that refer to pausing a master merge
due to error 2147943624. Each then says the merge will be rescheduled later.
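
For reference, that decimal value can be unpacked as a standard HRESULT. The
sketch below is only bit arithmetic on the number quoted in the event log; the
function name decode_hresult is made up for illustration. The low 16 bits come
out as Win32 error 1224, which appears to correspond to ERROR_USER_MAPPED_FILE
in winerror.h (an operation attempted on a file that has a user-mapped section
open). Treat that reading as a lead to check, not a diagnosis.

```python
def decode_hresult(value: int) -> dict:
    """Split an unsigned 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),       # severity bit: 1 means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility; 7 is FACILITY_WIN32
        "code": value & 0xFFFF,             # low 16 bits: the underlying error code
    }


# The error number reported in the MssCi master merge entries.
result = decode_hresult(2147943624)
print(result)
```

If that reading is right, it may be worth checking whether anything else on
the server (a backup agent or virus scanner, for example) has the catalog
files open while the merge runs. That is a guess, not something the log shows.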

I hope this helps you help me. :o) This client also mentioned over the
phone that it usually takes 3-4 days before the problem crops up. So, I had
him schedule a Windows task that stops and starts the Microsoft Search
service every night. (We're hoping that this prevents the problem, but it
is basically a stab in the dark.)

John

"John Kane" <jt-kane@comcast.net> wrote in message
news:u4$DL7lqEHA.4004@TK2MSFTNGP10.phx.gbl...
> John,
> Yes, I would have to agree with both of your points, especially #2, unless
> other punctuation characters (such as ., # or ;) are in contact with
> the search words... I've attached a SQL script file
> (FTS_with_Change_Tracking_pooling.sql) that inserts 1 row (or document) and
> then measures how long it takes before an FTS query successfully returns
> that row. I'd recommend that you alter this script, test it with your data,
> and then change the WAITFOR DELAY until you get a valid response. Note, you
> should execute the statements from the DELETE through the WAITFOR as a
> batch, starting with 1 second and increasing until you get a return of 1 row
> affected.
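
The attached script is not reproduced in this thread, so the following is not
its contents. It is only a sketch of the procedure described above (reset,
insert one row, wait, query, and lengthen the wait until the row comes back),
written in Python so the control flow is easy to see. FakeIndex and every name
in it are hypothetical stand-ins for the real table and full-text catalog, and
the 6 second lag is an arbitrary value chosen for the demonstration.

```python
import time


def smallest_working_delay(reset, insert_row, search_finds_row, delays,
                           sleep=time.sleep):
    """Return the first delay after which a freshly inserted row is searchable."""
    for delay in delays:
        reset()          # corresponds to the DELETE at the top of the batch
        insert_row()     # insert the single test row
        sleep(delay)     # corresponds to WAITFOR DELAY
        if search_finds_row():
            return delay
    return None          # never became searchable within the tested delays


class FakeIndex:
    """Hypothetical catalog: a row becomes searchable `lag` seconds after insert."""

    def __init__(self, lag):
        self.lag = lag
        self.clock = 0.0          # simulated time, so the demo does not really sleep
        self.inserted_at = None

    def reset(self):
        self.inserted_at = None

    def insert(self):
        self.inserted_at = self.clock

    def sleep(self, seconds):
        self.clock += seconds

    def search(self):
        return (self.inserted_at is not None
                and self.clock - self.inserted_at >= self.lag)


index = FakeIndex(lag=6)
latency = smallest_working_delay(index.reset, index.insert, index.search,
                                 delays=(1, 2, 3, 5, 7, 10), sleep=index.sleep)
print(latency)
```

Against a real server, the three callables would wrap the DELETE, the INSERT
and the full-text query, and the default time.sleep would be used instead of
the simulated clock.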
>
> I've forwarded this script to Microsoft, but have yet to hear back as to
> whether or not this is a bug or a PRB. Most likely in your case, the
> MSSearch service is not able to take the push of data from SQL Server at
> your rate, and the stop / restart of the MSSearch service clears up
> whatever error or issue is at the heart of this problem.
>
> I'd also recommend that you review the problem server's Application event
> log for any source events from "Microsoft Search" or MssCi (informational,
> warnings & errors) prior to the time of stopping & re-starting the
> MSSearch service. If you find any suspect entries, could you post them
> here?
>
> While the XML IFilters might not be the solution for this issue, they were
> provided as an FYI, in case you did hit the bug# I made reference to in my
> reply below.
>
> Regards,
> John
>
>
>
> "John" <John@Comprompter.com> wrote in message
> news:uLz$tklqEHA.556@TK2MSFTNGP11.phx.gbl...
>> Interesting. I'm not sure the bug you mentioned would apply, for two
>> reasons.
>>
>> 1) The search query succeeds on the new data after the search service has
>> been restarted, indicating (to me) that the service never indexed the row
>> to begin with.
>> 2) The bug sounds to me as if it could only affect words at the beginning
>> and end of one of our paragraphs (which is the only place the words the
>> users search for could meet with an XML element tag). But, once the
>> behavior is noticed, they realize that no query successfully returns data
>> from the newer items, but old data comes through.
>>
>> I'm also wondering what value the IFilter would really provide, since the
>> users aren't actually querying for XML content, but for content within
>> the <paragraph></paragraph> tags within the XML. Also, it sounds like a
>> lot of work for us and another expense for the clients. I'd need to
>> understand the point a little better.
>>
>> John
>>
>>
>> "John Kane" <jt-kane@comcast.net> wrote in message
>> news:%23V6kDElqEHA.896@TK2MSFTNGP12.phx.gbl...
>> > You're welcome, John
>> > At the frequency (timing) and volume (number of rows updated/inserted)
>> > you describe below, the log-based sysfulltextnotify push to the MSSearch
>> > service *should* be able to keep up with the updates, for both your
>> > larger and smaller tables. However, without knowing what they were
>> > searching for and did not find in the recent updates, this may be a
>> > different issue, and stopping & re-starting the MSSearch service may be
>> > hiding it.
>> >
>> > Since you are storing your XML data in a column defined with a TEXT
>> > datatype AND you are using SQL Server 2000 SP3 on Windows 2000 Server,
>> > the *real* issue may be whether or not the XML metatags are in contact
>> > with the search word. This can only be determined by looking at the
>> > actual XML and checking whether, in recently updated rows, the search
>> > word directly touches its tags, as in <tag>search_word</tag>.
>> >
>> > There is a well known (within this newsgroup) bug (Shiloh bug# 351310
>> > "Full Text Search Win2K word breaker does not ignore punctuation unless
>> > separated by white space"): when search text is combined with HTML or
>> > XML metatags in non-IMAGE datatypes, the search_word cannot be found if
>> > it is in contact with (touching) the < or > punctuation characters of a
>> > metatag. This happens on Windows 2000 Server, but not on Windows Server
>> > 2003, as it has a new wordbreaker. For a workaround to this bug on Win2K,
>> > you can use the Neutral "Language for Word Breaker" for your XML column
>> > and then run a Full Population. Note, if you switch to the Neutral
>> > wordbreaker, you will lose some language-specific SQL FTS functionality,
>> > specifically formsof(inflectional), as the Neutral word breaker breaks
>> > the words based upon the "white space" between them.
>> >
>> > A more long term solution is to store your XML files in a column using
>> > the IMAGE datatype, and to add a "file extension" column of char(3) or
>> > char(4) that contains the file extension; for XML, use xml or .xml (note
>> > the period before xml). You can then download and install the Microsoft
>> > SharePoint-based XML IFilter at
>> > http://www.microsoft.com/sharepoint/server/techinfo/reskit/XML_Filter.asp,
>> > keeping in mind that to index attributes/elements which have values
>> > greater than 32 characters you need to install OfficeXP locally on your
>> > IS machine.
>> >
>> > The XML IFilter will index (1) values of sub-elements of the root
>> > element when the sub-elements have no child elements and (2) attributes
>> > of the root element and attributes of sub-elements of the root element.
>> > Consider a sample XML document:
>> >
>> > <?xml version="1.0" ?>
>> > <book title="YourBook">
>> > This a book chapter
>> > <author>
>> > First Last
>> > <AGE>20</AGE>
>> > </author>
>> > <ISBN>
>> > 222222222
>> > </ISBN>
>> > </book>
>> >
>> > Only the title attribute of the book element and the value of the ISBN
>> > element would be indexed and queryable in this case. You might want to
>> > consider alternatives to Microsoft's SharePoint XML IFilter, such as
>> > QuiLogic at http://www.quilogic.cc/ifilter.htm or 3 Tier Technology at
>> > http://www.3tt.com.au/products/xmlFilter/default.html
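
The two indexing rules quoted above can be checked against the sample
document with a few lines of Python. This is only a model of the rules as
stated in this thread, using the standard xml.etree.ElementTree parser; it is
not the IFilter itself, and the function name indexed_values is invented for
the example.

```python
import xml.etree.ElementTree as ET

# The sample document from the post, copied verbatim.
SAMPLE = """<?xml version="1.0" ?>
<book title="YourBook">
This a book chapter
<author>
First Last
<AGE>20</AGE>
</author>
<ISBN>
222222222
</ISBN>
</book>"""


def indexed_values(xml_text: str) -> list:
    """Apply the two stated rules and return the values that would be indexed."""
    root = ET.fromstring(xml_text)
    values = list(root.attrib.values())        # rule 2: attributes of the root
    for child in root:                         # direct sub-elements of the root only
        values.extend(child.attrib.values())   # rule 2: attributes of sub-elements
        has_children = len(child) > 0
        text = (child.text or "").strip()
        if not has_children and text:
            values.append(text)                # rule 1: value of a childless sub-element
    return values


print(indexed_values(SAMPLE))
```

Under these rules the text directly inside <book>, the author name and the
AGE value are all skipped, which is the concern when the searchable content
lives in nested elements.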
>> >
>> > I'd recommend that you first compare the XML search results on Win2003
>> > vs. Win2K and, only if that is not the true issue, then consider the
>> > above XML IFilters, as they will require a table schema change as well
>> > as related application changes.
>> >
>> > Regards,
>> > John
>> >
>> >
>> >
>> > "John" <John@Comprompter.com> wrote in message
>> > news:uc2LpxkqEHA.1152@TK2MSFTNGP11.phx.gbl...
>> >> Hi,
>> >>
>> >> You're definitely digging deep, as I was hoping for.
>> >>
>> >> (In reference to the larger catalog) the best way to look at the
>> >> users' updating patterns is to begin with the thought that all of the
>> >> articles are immediately sent to the database after they are written
>> >> (and in fact, may be sent multiple times AS they are written and
>> >> edited). There are only 20 or so users doing this work. So, the effect
>> >> is that the data is not being sent to the server in batches at all. The
>> >> model is much closer to individual updates sent at evenly spaced
>> >> intervals throughout the day. (Of course, there is the remote
>> >> possibility that 20 changes or new articles may be added at once, via
>> >> separate query operations.)
>> >>
>> >> The small catalog receives a stream of data that tends to come in
>> >> small bursts of maybe 10 items at once.
>> >>
>> >> They both use a TEXT column type. The OS for the server in question
>> >> is Windows 2000.
>> >>
>> >> Thanks,
>> >> John
>> >>
>> >>
>> >> "John Kane" <jt-kane@comcast.net> wrote in message
>> >> news:emZaNakqEHA.1160@tk2msftngp13.phx.gbl...
>> >> > You're welcome, John,
>> >> > I'll try to clarify my questions... Basically, I'm assuming that each
>> >> > article is a row in your FT-enabled table. Do they update all of the
>> >> > 100 - 500 articles at one time, or do they spread out the updates, such
>> >> > that 100 articles are updated (or inserted) at 10:00am, another 100 at
>> >> > 12:00pm, another 100 at 2:00pm, and so on? If they update or insert all
>> >> > (approx.) 500 articles at one time, then this may be too many articles
>> >> > for the SQL Server-to-MSSearch push of data (via an undocumented system
>> >> > table, sysfulltextnotify) to handle at one time.
>> >> >
>> >> > If they are doing the latter and batching the updates or inserts in
>> >> > batches of (approx.) 100 articles throughout the day, this *may* be a
>> >> > better strategy, as it allows the SQL Server-to-MSSearch push of data
>> >> > to the FT Catalog to complete in a reasonable time. However, 100
>> >> > articles at one time may also be too many, so you may have to have
>> >> > them update in batches of 10, or possibly do the updates one at a time
>> >> > as the articles are posted to SQL Server.
>> >> >
>> >> > Now that I have more information, another factor here may be the fact
>> >> > that you are using XML data for the article description. Depending upon
>> >> > the datatype (varchar, TEXT, nchar or IMAGE, etc.) as well as the OS
>> >> > platform (Win2K or Win2003?), you may not be getting the best results,
>> >> > i.e., finding the expected results, as both the datatype and the OS
>> >> > platform have issues that can be resolved by using an XML IFilter along
>> >> > with storing the XML data in an IMAGE datatype. Could you provide more
>> >> > information on this? Specifically, the OS platform and the datatype
>> >> > that the XML description is stored in?
>> >> >
>> >> > Thanks,
>> >> > John
>> >> >
>> >> >
>> >> >
>> >> > "John" <John@Comprompter.com> wrote in message
>> >> > news:uLvn5CkqEHA.2588@TK2MSFTNGP12.phx.gbl...
>> >> >> Thanks for the questions,
>> >> >>
>> >> >> The client has told me that they have SQL Server SP3 installed
>> >> >> (8.00.760).
>> >> >>
>> >> >> I'm not sure I completely understand your question about the frequency
>> >> >> of updates. The client does almost exactly the same thing every day
>> >> >> (it's a television news station). Each story is an article. The data
>> >> >> stored in the full-text column is an XML description of the story, and
>> >> >> is generally between 1K and 6K in size (due to markup and extra
>> >> >> non-text data).
>> >> >>
>> >> >> They will go for several days without this problem occurring, but then
>> >> >> they suddenly realize that they need to restart the search service
>> >> >> (apparently, on rarer occasions, the propagation won't work correctly
>> >> >> until the server is rebooted). I can't verify the actual number of rows
>> >> >> being used per day, but I can say without hesitation that each day is
>> >> >> almost exactly the same as every other day in terms of content size and
>> >> >> number of updates.
>> >> >>
>> >> >> Thanks,
>> >> >> John
>> >> >>
>> >> >> "John Kane" <jt-kane@comcast.net> wrote in message
>> >> >> news:%23mAO73jqEHA.868@TK2MSFTNGP10.phx.gbl...
>> >> >> > John,
>> >> >> > Thanks for the additional information that your client has "Change
>> >> >> > Tracking" and "Update Index in Background" enabled, as this indicates
>> >> >> > that they have SQL Server 2000. Could you provide some additional
>> >> >> > info? Specifically, the full output of -- SELECT @@version -- as this
>> >> >> > is helpful in troubleshooting SQL FTS issues such as this.
>> >> >> >
>> >> >> > The primary issue revolves around two facts: (1) recent data
>> >> >> > (anywhere from 5 minutes to 5 hours old) is not being found when they
>> >> >> > perform a search and (2) that data is generated at the rate of 100 -
>> >> >> > 500 articles per day. Could you verify that your client is updating,
>> >> >> > inserting or deleting 100 - 500 articles (or rows in the FT-enabled
>> >> >> > table) per day? If so, then updates at this rate may be too frequent
>> >> >> > or too numerous for the MSSearch service to push to the FT Catalog in
>> >> >> > a reasonable timeframe.
>> >> >> >
>> >> >> > There have been several threads on this newsgroup about the number
>> >> >> > (total number of rows being updated) and frequency (the volume of
>> >> >> > rows being updated over time) of updates, where the SQL
>> >> >> > Server-to-MSSearch update process cannot keep up with the *expected*
>> >> >> > volumes and post the updates into the FT Catalog in a reasonable
>> >> >> > amount of time. I have a simple repro of this for a one row update
>> >> >> > that should post to the FT Catalog in 1 second, but does not post for
>> >> >> > at least 5 to 7 seconds. I have forwarded this to Microsoft, and
>> >> >> > others have opened support cases with Microsoft on this issue. At
>> >> >> > this time, I do not know the status of this issue, or whether it has
>> >> >> > been confirmed as a bug or as a PRB (a known issue, but not a bug).
>> >> >> >
>> >> >> > Thanks, and any additional information you can provide would be
>> >> >> > helpful!
>> >> >> > John
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > "John" <John@Comprompter.com> wrote in message
>> >> >> > news:OJFaiJjqEHA.592@TK2MSFTNGP11.phx.gbl...
>> >> >> >> I can't answer that definitively, but the client has told me that
>> >> >> >> they found no relevant error messages...
>> >> >> >>
>> >> >> >> Reading my previous post, I noticed two things that I should clear
>> >> >> >> up.
>> >> >> >>
>> >> >> >> 1) The larger catalog uses a table with a timestamp column (titled
>> >> >> >> "LastChanged"). The smaller catalog refers to a table without a
>> >> >> >> timestamp column.
>> >> >> >>
>> >> >> >> 2) The proper terms for the index settings that are being used are
>> >> >> >> change tracking and background update (with no scheduled
>> >> >> >> populations).
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> John
>> >> >> >>
>> >> >> >>
>> >> >> >> "Hilary Cotter" <hilary.cotter@gmail.com> wrote in message
>> >> >> >> news:uDAceBjqEHA.1164@TK2MSFTNGP10.phx.gbl...
>> >> >> >> > Are there any errors or warnings in the event log from MssCi or
>> >> >> >> > MSSearch?
>> >> >> >> >
>> >> >> >> > --
>> >> >> >> > Hilary Cotter
>> >> >> >> > Looking for a SQL Server replication book?
>> >> >> >> > http://www.nwsu.com/0974973602.html
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > "John" <John@Comprompter.com> wrote in message
>> >> >> >> > news:eyTMg7iqEHA.3848@TK2MSFTNGP14.phx.gbl...
>> >> >> >> >> Hi,
>> >> >> >> >>
>> >> >> >> >> I've got what I believe to be an unusual situation. One of our
>> >> >> >> >> clients is reporting that they need to restart the MSSearch
>> >> >> >> >> service at random intervals. They detect this need by noticing
>> >> >> >> >> that recent data (anywhere from 5 minutes to 5 hours old) is not
>> >> >> >> >> being found when they perform a search. However, the search
>> >> >> >> >> itself succeeds, returning the old data that matches their query.
>> >> >> >> >>
>> >> >> >> >> Apparently, the index has decided to stop updating for whatever
>> >> >> >> >> reason, and giving the MSSearch service a kick will correct the
>> >> >> >> >> issue for a while.
>> >> >> >> >>
>> >> >> >> >> What (if anything) can be done about this? The client reports
>> >> >> >> >> that there is no relevant information in the event logs.
>> >> >> >> >>
>> >> >> >> >> Here is a description of the setup:
>> >> >> >> >>
>> >> >> >> >> A single-CPU machine, average speed, no special hardware
>> >> >> >> >> considerations for the database. The database and the full-text
>> >> >> >> >> index reside together on a single dedicated drive. The full-text
>> >> >> >> >> index is set to do background updating, incremental population.
>> >> >> >> >> There are two catalogs, and I am not sure whether both manifest
>> >> >> >> >> the same behavior simultaneously. One catalog deals with years of
>> >> >> >> >> client-created data, which is generated at the rate of 100 - 500
>> >> >> >> >> articles per day. The other catalog deals with externally
>> >> >> >> >> generated information, and only retains one week of that data.
>> >> >> >> >> (Every evening it purges the content older than a week.)
>> >> >> >> >>
>> >> >> >> >> Please offer whatever clues you can, even if you think it may be
>> >> >> >> >> an incomplete answer.
>> >> >> >> >>
>> >> >> >> >> John
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >>
>> >>
>> >
>> >
>>
>>
>
>
>


