Re: Best way to parse/walk-thru WORD document using VBA?

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance

From: Word Heretic (myfullname_at_tpg.com.au)
Date: 09/30/04


Date: Thu, 30 Sep 2004 18:51:57 +1000

G'day "Jeff J Jones" <jjones#NOSPAM#@jonestc.com>,

Define parse and extract please. The current scope is literally a full
blown NLP and I find that extremely unlikely.

Usually specific data items are located by one or a combination of
these methods:

A fixed offset in a table. Eg, table 1, row 1, always has the Shrink's
name in it.

A keyword in a column of any table. Eg any table row whose cell 1
reads "Shrink's Name" has the Shrink's name in cell 2.

A bookmark of fixed name or prefix over a range.

Inline keywords. Eg, there is a heading somewhere, god knows what
style, that read's "Shrink's Name"

A specific character or paragraph style.
 
A tighter spec would make life easier mate. Why not

I have a document that has a Shrink's name in it. I don't know where I
can find this. Here are some examples of the context within which it
will be found.

Steve Hudson - Word Heretic
Want a hyperlinked index? S/W R&D? See WordHeretic.com

steve from wordheretic.com (Email replies require payment)

Jeff J Jones reckoned:

>Hi,
>
>I have a large WORD document that I need to parse and extract the
>information from.
>
>The information is formatted by sections, headers and tables.
>
>I have only briefly visited the object hierarchy for WORD and so am looking
>for insight into just how best to walk this document and extract the
>information.
>
>For example, is there a way to move from cell to cell in a table?
>
>Thanks much
>
>



Relevant Pages

  • Parse an email address
    ... I need to parse a email address to get the url. ... Given a cell containing: ... I need to extract: ... MJC ...
    (microsoft.public.excel.worksheet.functions)
  • Parse out data in a cell.
    ... In a datagrid, I want to be able to extract a cell, and parse out the data ... Cell8 returns the full telephone number. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: Getting all rows of data that have a value in a particular col
    ... So you want to extract data where there is a value greater than 0 in *BOTH* ... What is the cell address where the table starts? ... county code sales tax ... "Biff" wrote: ...
    (microsoft.public.excel.misc)
  • Looking to trees to treat rare eye cancer
    ... The cancer usually develops in children under age 6 and kills within two ... O'Brien and colleagues at UCSF wanted to see whether the tree bark extract ... apoptosis, or programmed cell death. ... No longer recruiting ARQ 501 in Combination With Docetaxel in Patients ...
    (sci.med.diseases.cancer)
  • Re: Speed of using OLE for automation
    ... This functions extract all the ... Additionally, style, font size, page and line number is extracted. ... The challenge is that I need to parse large documents with my ... works closely with the Word object model into VBA procedures. ...
    (microsoft.public.office.developer.automation)