Processing Word documents



We are going to be receiving MS Word documents (.doc format) that
need to be processed by BizTalk. The Word documents contain
forms from which we need to extract the data, which is then
deposited into a database.

I was thinking that a receive pipeline component could process
the stream from a File adapter, extract the data from the forms
and build an XML string that conforms to the schema we'll be using.
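
The last step of that idea, building the XML once the values are in hand, might look roughly like the sketch below. It is in Python purely for illustration (a real pipeline component would be written in .NET), and the namespace, root element name and field names are all made up. It also assumes the form values have already been pulled out of the document as name/value pairs, which is the part I don't know how to do.

```python
import xml.etree.ElementTree as ET

# Hypothetical target namespace; the real schema is not shown in this post.
TARGET_NS = "http://example.org/schemas/form-data"


def build_form_xml(fields, root_name="FormData", namespace=TARGET_NS):
    """Build an XML string from already-extracted form field values.

    `fields` maps field names to values. The names are assumed to be
    valid XML element names; no sanitising is done here.
    """
    root = ET.Element(f"{{{namespace}}}{root_name}")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{namespace}}}{name}")
        # Treat a missing value as an empty element.
        child.text = "" if value is None else str(value)
    # ElementTree escapes special characters such as & and < for us.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample values standing in for data read from a Word form.
    sample = {"CustomerName": "A. Example", "OrderTotal": "12.50"}
    print(build_form_xml(sample))
```

Building the elements through an XML library rather than concatenating strings means special characters in the form values get escaped automatically.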

I don't know the first thing about processing Word files.

Has anyone done this, or something that accomplishes the same
thing? Where do I even start?

- h