Re: "IFilter"ed WordML Document uses "plain text" vs. "Office" filter
- From: "Hilary Cotter" <hilary.cotter@xxxxxxxxx>
- Date: Wed, 12 Oct 2005 10:08:58 -0400
Hi Christian,
I am looking at your WordML docs, and they seem to be interpreted correctly.
What are you looking for that is not showing up?
Could you also please send me the default value of this registry key?
HKEY_CLASSES_ROOT\.xml\PersistentHandler
Take the GUID you find there and look it up under HKEY_CLASSES_ROOT\CLSID.
Keep following the GUIDs from key to key until you get as far as you can.
On my machine, the GUID under PersistentHandler is
{7E9D8D44-6926-426F-AA2B-217A819A5CCE}
Under that GUID's entry in CLSID I have a PersistentAddinsRegistered key with
a subkey {89BCB740-6119-101A-BCB7-00DD010655AF}, which has a value of
{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}
Chasing this down under CLSID I see:
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\CLSID\{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}]
@="XML Content Filter Class"
[HKEY_CLASSES_ROOT\CLSID\{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}\InprocServer32]
@="C:\\WINNT\\System32\\XMLFIL~1.DLL"
"ThreadingModel"="Free"
[HKEY_CLASSES_ROOT\CLSID\{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}\ProgID]
@="Search.XmlContentFilter.1"
[HKEY_CLASSES_ROOT\CLSID\{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}\VersionIndependentProgID]
@="Search.XmlContentFilter"
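
The registry chase above can be sketched in Python. This is only a sketch under some assumptions: the names resolve_ifilter, winreg_reader and SAMPLE are made up for illustration, and the sample values are simply the ones from my machine quoted above, so yours may differ. {89BCB740-6119-101A-BCB7-00DD010655AF} is the interface ID of IFilter.

```python
# IID of the IFilter interface; this is the subkey name found under
# PersistentAddinsRegistered in the chase described above.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def resolve_ifilter(extension, read_default):
    """Follow extension -> persistent handler -> filter CLSID -> DLL.

    read_default(path) must return the default value of the key at
    `path` (relative to HKEY_CLASSES_ROOT), or None if it is missing.
    The returned dict holds every step that could be resolved.
    """
    steps = {}
    handler = read_default(extension + "\\PersistentHandler")
    steps["persistent_handler"] = handler
    if handler is None:
        return steps

    path = "\\".join(["CLSID", handler, "PersistentAddinsRegistered", IID_IFILTER])
    filter_clsid = read_default(path)
    steps["filter_clsid"] = filter_clsid
    if filter_clsid is None:
        return steps

    steps["description"] = read_default("CLSID\\" + filter_clsid)
    steps["dll"] = read_default("CLSID\\" + filter_clsid + "\\InprocServer32")
    return steps


def winreg_reader(path):
    """Windows-only reader backed by the standard winreg module."""
    import winreg  # imported here so the sketch still loads elsewhere
    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, path) as key:
            value, _value_type = winreg.QueryValueEx(key, "")  # "" = default value
            return value
    except OSError:
        return None


# Values copied from the registry chase quoted in this message.
SAMPLE = {
    r".xml\PersistentHandler": "{7E9D8D44-6926-426F-AA2B-217A819A5CCE}",
    r"CLSID\{7E9D8D44-6926-426F-AA2B-217A819A5CCE}\PersistentAddinsRegistered"
    r"\{89BCB740-6119-101A-BCB7-00DD010655AF}": "{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}",
    r"CLSID\{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}": "XML Content Filter Class",
    r"CLSID\{41B9BE05-B3AF-460C-BF0B-2CDD44A093B1}\InprocServer32":
        r"C:\WINNT\System32\XMLFIL~1.DLL",
}

result = resolve_ifilter(".xml", SAMPLE.get)
for step, value in result.items():
    print(step, "=", value)
```

On a live Windows machine, resolve_ifilter(".xml", winreg_reader) should walk the same chain against the real registry. If the DLL at the end is not the one you expect, that tells you which filter is actually registered for .xml.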
--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Christian Dehaeseleer" <ChristianDehaeseleer@xxxxxxxxxxxxxxxxxxxxxxxxx>
wrote in message news:3ACC0598-201C-479B-ADC3-8C418BA72FAE@xxxxxxxxxxxxxxxx
> The extension for WordML docs is "XML".
> The fact that it is a "Word" XML document is detected by some
> applications (i.e., Word and IE) via a custom processing instruction in the
> XML itself.
>
> You will find samples here:
> http://www.xmlinoffice.com/allpart2.zip
> (for instance: "article WordML.xml")
>
> Thanks for your help!
>
> --
> Best regards,
> Christian Dehaeseleer.
>
>
> "Hilary Cotter" wrote:
>
> > What is the extension of the WordML docs? Could you post one here or
> > send it to me offline?
> >
> > --
> > Hilary Cotter
> > Looking for a SQL Server replication book?
> > http://www.nwsu.com/0974973602.html
> >
> > Looking for a FAQ on Indexing Services/SQL FTS
> > http://www.indexserverfaq.com
> >
> > "Christian Dehaeseleer" <ChristianDehaeseleer@xxxxxxxxxxxxxxxxxxxxxxxxx>
> > wrote in message news:BC9C48BA-71FF-407C-8C3C-707AC8CB0442@xxxxxxxxxxxxxxxx
> > > Scenario:
> > > Indexing the text of a WordML document (saved as XML - NOT "Data only")
> > > obtained via IFilter interface
> > > (product version: Office Word 2003 SP1)
> > >
> > > Actual result:
> > > The extraction of text is done via the "Plain text" filter (in
> > > "query.dll").
> > >
> > > Issue:
> > > All the "non-content" metadata of the WordML is extracted along with
> > > the "content" data
> > >
> > >
> > > ***** Question: How can I make sure that the extraction is done via the
> > > "Office" Filter (in "offfilt.dll")?
> > >
> > > Ideally, a solution where "XML" is registered to be handled (correctly)
> > > by the Office Filter is better than having to "manually" program the
> > > extraction...
> > >
> > > Thanks for all help and pointers!
> > >
> > >
> > > --
> > > Best regards,
> > > Christian Dehaeseleer.
> > > P.S. This was posted to "Office.Office.XML" earlier, but did not get any
> > > answer...
> > >
> >
> >
> >