Re: newbie: Full Text Search Against PDF Blobs
- From: "Hilary Cotter" <hilary.cotter@xxxxxxxxx>
- Date: Mon, 13 Aug 2007 22:58:21 -0400
With Lucene you really have to roll your own solution; all it is is a
full-text search engine. You have to write the code that queries it and the
code that feeds it the documents to index. Lucene is designed for the 5-10
million document range, but can be scaled much higher. It is optimized to
return results in batches of 10, 20, 25 or 100. If you return all
results, its performance is much worse than SQL FTS.
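
To make the "feed it documents, query it, get results back in batches" idea concrete, here is a minimal sketch in Python. It is a toy in-memory inverted index, not Lucene and not Lucene's API; the class name TinyIndex, its methods, and the term-count scoring are all invented for illustration.

```python
import heapq
import re
from collections import defaultdict


class TinyIndex:
    """Toy in-memory inverted index (hypothetical; illustrative only)."""

    def __init__(self):
        # term -> {doc_id: number of times the term occurs in that document}
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        # "Feeding" a document: tokenize the text and record term counts.
        for term in re.findall(r"\w+", text.lower()):
            counts = self.postings[term]
            counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, query, page=0, page_size=10):
        # Score each matching document by its summed term counts.
        scores = defaultdict(int)
        for term in re.findall(r"\w+", query.lower()):
            for doc_id, count in self.postings.get(term, {}).items():
                scores[doc_id] += count
        # Select only the best (page + 1) * page_size hits rather than
        # sorting every match; ties are broken by doc_id.
        wanted = (page + 1) * page_size
        top = heapq.nsmallest(
            wanted, scores.items(), key=lambda kv: (-kv[1], kv[0])
        )
        return top[page * page_size:]


index = TinyIndex()
index.add("a.pdf", "quarterly sales report")
index.add("b.pdf", "sales sales forecast")
index.add("c.pdf", "holiday schedule")
first_page = index.search("sales report", page_size=2)
```

Asking for one small page lets the engine keep only the best few hits, which is the access pattern described above; asking for every result forces it to rank and return everything that matched.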
Lucene allows you to do true property-based searches.
SQL FTS is highly scalable but you really have to think about partitioning
after you hit 50 million rows.
You really have to test to see what works best in your environment.
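
One rough way to run that kind of test is to time the same query against each engine several times and compare medians. The sketch below is only a timing harness; median_seconds is a made-up helper name, and the callables you pass in (for example, one that runs your SQL FTS query and one that runs your Lucene query) are yours to supply.

```python
import statistics
import time


def median_seconds(run_query, repeats=5):
    """Call run_query() several times; return the median wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Stand-in workload; replace the lambda with a real search call.
elapsed = median_seconds(lambda: sum(range(1000)), repeats=3)
```

Run it against your own documents and your own typical queries, once asking for the first page of results and once asking for all of them, since those two cases can behave very differently.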
--
relevantNoise - dedicated to mining blogs for business intelligence.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Des" <support@xxxxxxxxxxxxxx> wrote in message
news:%235elAfg3HHA.2064@xxxxxxxxxxxxxxxxxxxxxxx
I have a client for whom this solution sounds perfect.
Does anyone know of a web site where I can test the "Full Text Search Against
PDF Blobs" functionality?
The website "layout design guy" is saying that SQL 2005 will be too slow
and we should use "Lucene", an open source indexer instead.
Does anyone have any info I can use to show this guy that SQL Server 2005
will be faster?
The target site has several thousand Report PDFs at 2 MB average each
(about 10 GB in total).
I have watched the video
"http://download.microsoft.com/download/b/3/8/b3847275-2bea-440a-8e2e-305b009bb261/sql_13.wmv"
that was referenced in this group recently.
Thanks,
Des