Catalog Population VS Database Backup
From: RGondzur (rgondzur_at_NO_SPAM_aicsoft.com)
Date: 03/19/04
- Next message: John Kane: "Re: Catalog Population VS Database Backup"
- Previous message: Jim Roberts: "Re: An unknown full-text failure (80004005) occurred in function EnumCatalogs"
- Next in thread: John Kane: "Re: Catalog Population VS Database Backup"
- Reply: John Kane: "Re: Catalog Population VS Database Backup"
- Reply: RGondzur: "Re: Catalog Population VS Database Backup"
- Messages sorted by: [ date ] [ thread ]
Date: Fri, 19 Mar 2004 16:01:09 -0500
MS Windows 2000 Adv Server
SQL Server 2000
The FTS catalog is stored on the same box as SQL Server.
There are two applications that access a SQL database.
I have one application that stores data from an OCR process in a SQL table.
The table includes a timestamp column.
FTS indexes one text column on this table.
The table currently holds 500,000 rows.
I have FTS programmed for incremental population.
I currently perform OCR on images in batches of 500 items at a time.
The OCR data is inserted into the SQL table as it is processed.
At the completion of the batch, I then start the catalog population to index
the text added to the table.
Each population is taking approximately 20 minutes.
During the population process, CPU utilization on the SQL box hits 80+
percent.
Users do not access this application directly and it runs on its own box on
the network.
The second application, the one being accessed by users, searches against
the FT Search catalog looking for text contained in the images that were
OCRed
.
Since the first app can initiate a catalog population event at the end of
OCR batch process, it is causing poor performance for
all users of the second app while it is indexing. Users are complaining of
slow response.
I am considering moving the catalog population to one scheduled job that
runs nightly and indexes all of the images that were processed during the
day. This will likely cause a population to take 6-8 hours per night.
In addition, the nightly database backup takes 5-6 hours per night.
Q1.
Is there a way to minimize the impact of the catalog population on the
database using the existing method that starts the indexing process after
each batch?
I like the idea of indexing small batches as they are processed and keeping
the indexing time to approximately 20 minutes
per batch. The drawback is the poor response for all users during this time.
Q2.
If I go to one scheduled population job nightly, what affect does the
catalog population job have on the database backup job when the two overlap?
I'm afraid that both jobs running concurrently will greatly increase their
completion time and they won't finish before the start of the new day.
- Next message: John Kane: "Re: Catalog Population VS Database Backup"
- Previous message: Jim Roberts: "Re: An unknown full-text failure (80004005) occurred in function EnumCatalogs"
- Next in thread: John Kane: "Re: Catalog Population VS Database Backup"
- Reply: John Kane: "Re: Catalog Population VS Database Backup"
- Reply: RGondzur: "Re: Catalog Population VS Database Backup"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|