2nd Try: Extracting data from Word 2003 using XSLT
- From: "vasdeep" <vasdeep@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Wed, 18 May 2005 09:53:41 -0700
Could someone please respond?
"vasdeep" wrote:
> Sorry, my earlier email was incomplete.
>
> I meant to ask...
> Could someone please help me with these issues?
>
> Thanks in advance.
>
> vasdeep
>
>
> "vasdeep" wrote:
>
> > I have the following requirement.
> > 1. The user chooses to download the contents captured in our system as a
> > Word document. I have written an XSLT that converts the system data [in
> > XML format] into WordML. To identify the contents of this document, I
> > also create hidden tags as part of this transformation.
> > Using the Word property "w:visible", I add these tags. For example, a tag
> > could be something like <vasuclause> </vasuclause>
> > These tags will not be visible to the end user when viewed in MS
> > Word 2003. The contents between these tags will be the text that the user
> > will see, and it represents an entity in our system. So there will be many
> > occurrences of this pattern, corresponding to the entities in the system.
> > But when Word 2003 tries to open the WordML created this way, it will
> > try to process the tag "<vasuclause>" and error out. To avoid this, as part
> > of my transformation I escape the "<" and ">" characters, resulting in
> > "&lt;vasuclause&gt;" and "&lt;/vasuclause&gt;"
> >
> > Question 1: Is this approach right?
> >
> > The user can now open the document in Word 2003.
> >
> > 2. The user can make changes to the document and upload it back to the
> > system.
> > I have written another XSLT that reads the contents of the document
> > and extracts the text including the formatting,
> > i.e. if the user has used bold, italics, underline, lists, paragraphs
> > etc., all that information is captured as part of this transformation. The
> > XSLT has relevant templates to convert the run properties for "bold",
> > "italics", "lists", etc. to their HTML equivalents. The XSLT outputs an XML
> > document containing the text with relevant formatting instructions.
> > As part of this, I need to read the "vasuclause" tags that I had inserted
> > and use them to identify each entity.
> > Question 2: The XML output should be something like
> > <vasuclause>This is the first paragraph with <b> bold </b>
> > </vasuclause>
> > The second stage of this process is to use a SAX parser to read
> > the XML and insert the data into the database. But the XML output is not
> > correct: it has the tags escaped. Is there a way to resolve this?
> >
> >
> > The examples available on MSDN talk about using an XSD to define the data
> > definition for the Word document and defining blocks where the user can
> > provide input. The user can then save the file by choosing the "Save Data
> > Only" option. But this does not work for my requirement: doing so saves only
> > the data (text) and loses the formatting. I need to capture both the data
> > and the formatting.
> >
> > Also, these examples talk about the user entering data into specific input
> > blocks. In my case, the user can add new paragraphs of text. Hence, I have
> > not been able to use the suggested solution. Is there something that I'm
> > missing?
> >
> > Is there a way to achieve what I wish to accomplish? Will the procedure I
> > have explained above work? Is there a better way?
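
For Question 2, one possible workaround is to restore the escaped marker tags before the SAX stage. The sketch below is only an illustration under assumptions: it uses Python rather than the poster's actual parser, the sample string and the <doc> wrapper are made up, and the names restore_markers, ClauseHandler and extract_clauses are hypothetical. It assumes the second XSLT emits the markers as escaped text while the formatting tags such as <b> are real markup.

```python
import re
import xml.sax

# Hypothetical sample of the second XSLT's output: the marker tags arrive
# escaped, while the formatting tags are real markup. The <doc> wrapper is
# an assumption so that the string has a single root element.
escaped_output = (
    "<doc>"
    "&lt;vasuclause&gt;This is the first paragraph with <b> bold </b>"
    "&lt;/vasuclause&gt;"
    "</doc>"
)

# Restore only the known marker name. Any other escaped "<" (for example
# one the user typed in Word) stays escaped.
MARKER = re.compile(r"&lt;(/?)vasuclause&gt;")


def restore_markers(xml_text):
    """Turn escaped vasuclause markers back into real tags."""
    return MARKER.sub(r"<\1vasuclause>", xml_text)


class ClauseHandler(xml.sax.ContentHandler):
    """Collects the character data found inside each vasuclause element."""

    def __init__(self):
        super().__init__()
        self.clauses = []
        self._inside = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "vasuclause":
            self._inside = True
            self._buffer = []

    def characters(self, content):
        # Text inside nested formatting tags such as <b> is collected too.
        if self._inside:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "vasuclause":
            self._inside = False
            self.clauses.append("".join(self._buffer))


def extract_clauses(xml_text):
    """Restore the markers, then read the clauses with a SAX parser."""
    handler = ClauseHandler()
    xml.sax.parseString(restore_markers(xml_text).encode("utf-8"), handler)
    return handler.clauses


if __name__ == "__main__":
    for clause in extract_clauses(escaped_output):
        print(repr(clause))
```

This relies on the markers staying intact and balanced after the user edits the document. If a user deletes one half of a pair, the restored text is no longer well-formed and the parser raises xml.sax.SAXParseException, so the upload step would need to handle that case. The handler here keeps only the text; a real one would also record the formatting tags it sees.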