Re: Reading textfiles line by line
- From: "Patrice" <nobody@xxxxxxxxxxx>
- Date: Wed, 25 May 2005 21:02:04 +0200
IMO it will never change. For example, if I've got a product that handles a
first field of 10 chars and another of 4 chars, the vendor could decide to
export its data as a text file using 10 chars for the first field and 4 chars
for the second.
Whatever the data are, the layout will never change and each line will always
have 14 chars. There is no need to ever use more characters, as the first
field can't hold more than 10 and the other can't hold more than 4.
IMO this is the way this file works.
This is known as a "fixed width" (or "fixed size") file, as the size of each
field never changes, as opposed to a "delimited" file, in which you have a
delimiter between fields.
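A minimal C# sketch of that idea, assuming the hypothetical 10 + 4 layout from
the example above (not the real widths of Martin's file, which would have to
be measured in a fixed-width font first). FixedWidthDemo and SplitFixed are
names made up for illustration:

```csharp
using System;

static class FixedWidthDemo
{
    // Assumed layout from the example: a 10-char field followed by a 4-char field.
    static readonly int[] Widths = { 10, 4 };

    // Cuts a line at fixed positions; whitespace plays no role in finding fields.
    public static string[] SplitFixed(string line, int[] widths)
    {
        string[] fields = new string[widths.Length];
        int pos = 0;
        for (int i = 0; i < widths.Length; i++)
        {
            // Guard against short lines (e.g. trailing blanks stripped on export).
            int start = Math.Min(pos, line.Length);
            int len = Math.Min(widths[i], line.Length - start);
            fields[i] = line.Substring(start, len).Trim();
            pos += widths[i];
        }
        return fields;
    }

    static void Main()
    {
        // The first field is completely full, so the two fields touch.
        string[] f = SplitFixed("ABCDEFGHIJ20KG", Widths);
        Console.WriteLine("[{0}] [{1}]", f[0], f[1]); // prints "[ABCDEFGHIJ] [20KG]"
    }
}
```

Because the cut points come from positions rather than from whitespace, a line
where two fields touch still splits correctly.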
Patrice
--
"jamait" <jamait@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:4BC203B1-40F1-4D24-AC86-CAD43FAC99E2@xxxxxxxxxxxxxxxx
> So by getting the positions of the columns on the first row, I can use
> these in the following lines?
>
> I think that the column width is set by the max length of the 2nd
> column... so this would probably change if the description in any of the
> rows is longer. The file is an export from a different software package
> that I am not able to change.
>
> Is there a way to replace 2 or more spaces with a separator, or is there a
> better way to split the columns?
>
> Thanks
>
> /Martin
>
> "Patrice" wrote:
>
> > Sorry but I'm afraid I still don't catch the exact problem. The two lines
> > you showed us are using the same format.
> >
> > There is NO separator. Each field uses a *fixed* width :
> >
> > Copy the two lines below :
> > A0012430 REKAL TVÄTTMEDEL EKOMAX 0,5L ST 75.9000
> > A0012550 REKAL TVÄTTMEDEL BIOKULÖR 20KGST 1727.0000
> >
> > Paste this into notepad and use the courier new font.
> >
> > You'll see that 5L and 20KG start at the same location (and are always 4
> > characters wide). ST and ST start at the same location.
> > The last field doesn't, but it is right-justified in a field that begins at
> > some unknown location (this is the only problem I see: you'll have to find
> > out the *fixed* length of the third field so that you can start reading the
> > 4th field from the correct position).
> >
> > Patrice
> >
> >
> > --
> >
> > "jamait" <jamait@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
> > news:00D6D172-ABF6-4735-AEFD-6F8D197D1EB4@xxxxxxxxxxxxxxxx
> > > Hi again,
> > >
> > > It is not a one-time operation, and what I really want to do is to split
> > > each line into 4 parts...
> > >
> > > The first is an ID field, the second a description, the third is the unit
> > > and is in the format of a known char array. The last column is the price
> > > of the product.
> > > Unfortunately, when this file is opened as plain text and read line by
> > > line, the 2nd and 3rd columns somehow merge on some of the rows...
> > > The merging only occurs where the description field is long enough,
> > > probably at the position of the longest description in the file... The
> > > unit is then appended to the description without any delimiter, which is
> > > the main problem.
> > > The other columns in the file are separated by more than 1 white space.
> > >
> > > I am thinking of using some kind of regular expression and extracting the
> > > information wanted line by line, but I'm not sure how to split the 2nd and
> > > 3rd columns.
> > > A0012430 REKAL TVÄTTMEDEL EKOMAX 0,5L ST 75.9000
> > > A0012550 REKAL TVÄTTMEDEL BIOKULÖR 20KGST 1727.0000
> > >
> > > Also updated the StreamReader to use the default encoding to display the
> > > Swedish characters properly...
> > >
> > >
> > > Help...
> > >
> > > /Martin
> > >
> > > > > >> StreamReader sr = new StreamReader(file, System.Text.Encoding.Default);
> > > > > >> string line = "";
> > > > > >> while ((line = sr.ReadLine()) != null)
> > > > > >> {
> > > > > >>     if (line.Length > 17)
> > > > > >>     {
> > > > > >>         DataRow dr = m_Data.NewRow();
> > > > > >>         dr["Col1"] = line.Substring(0, 17).Trim();
> > > > > >>         dr["Col2"] = ?
> > > > > >>         dr["Col3"] = ?
> > > > > >>         dr["Col4"] = ?
> > > > > >>         m_Data.Rows.Add(dr);
> > > > > >>     }
> > > > > >> }
> > > > > >> sr.Close();
> > >
> > >
> > >
> > >
> > > "Dave" wrote:
> > >
> > > > > The original file include approx. 5000 lines
> > > >
> > > > Is this a one-time operation? If you are using SQL Server you can write
> > > > a simple DTS package to do the transformation, or just use the Import
> > > > Data command in Enterprise Manager.
> > > >
> > > > --
> > > > Dave Sexton
> > > > dave@xxxxxxxxxxxxxxxxxxx
> > >
> > > > -----------------------------------------------------------------------
> > > > "Patrice" <nobody@xxxxxxxxxxx> wrote in message
> > > > news:unk3M9QYFHA.616@xxxxxxxxxxxxxxxxxxxxxxx
> > > > > Not sure what you meant by "the position seems to be the same depending
> > > > > on the text in col2"?
> > > > >
> > > > > What if you try to display this file using a fixed-width font such as
> > > > > Courier New? Are all fields aligned?
> > > > >
> > > > > To me it looks like this is a fixed-width file. Each column always uses
> > > > > the same range of characters on each line (but it may not be immediately
> > > > > visible when using a proportional font).
> > > > >
> > > > > Patrice
> > > > >
> > > > > --
> > > > >
> > > > > "jamait" <jamait@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
> > > > > news:8EEA78EC-6F35-4C62-BC94-592A262EFA77@xxxxxxxxxxxxxxxx
> > > > >> Hi all,
> > > > >>
> > > > > >> I'm trying to read a text file into a datatable...
> > > > > >>
> > > > > >> Not sure how to split up the information though: regex or
> > > > > >> substrings...?
> > > > > >>
> > > > > >> sample:
> > > > > >> Col1 Col2 Col3 Col4
> > > > > >> A0012430 REKAL TVÄTTMEDEL EKOMAX 0,5L ST 75.9000
> > > > > >> A0012550 REKAL TVÄTTMEDEL BIOKULÖR 20KGST 1727.0000
> > > > > >>
> > > > > >> Notice how the 2nd row has merged col2 and col3. There is no delimiter
> > > > > >> at all, but the position seems to be the same depending on the text in
> > > > > >> col2. The original file includes approx. 5000 lines that I want to
> > > > > >> update a SQL table with. The above problem occurs at many positions in
> > > > > >> the text file.
> > > > > >>
> > > > > >> I have successfully read in data from the text file using this code:
> > > > >>
> > > > > >> <code>
> > > > > >> StreamReader sr = File.OpenText(fileName);
> > > > > >> string line = "";
> > > > > >> while ((line = sr.ReadLine()) != null)
> > > > > >> {
> > > > > >>     if (line.Length > 17)
> > > > > >>     {
> > > > > >>         DataRow dr = m_Data.NewRow();
> > > > > >>         dr["Col1"] = line.Substring(0, 17).Trim();
> > > > > >>         dr["Col2"] = ?
> > > > > >>         dr["Col3"] = ?
> > > > > >>         dr["Col4"] = ?
> > > > > >>         m_Data.Rows.Add(dr);
> > > > > >>     }
> > > > > >> }
> > > > > >> sr.Close();
> > > > > >> </code>
> > > > >>
> > > > > >> Any help appreciated!
> > > > >>
> > > > >> /Martin
> > > > >>
> > > > >
> > > > >
> > > >
> > > >
> > > >
> >
> >
> >
.