Re: Problem with embedded carriage returns
- From: "Stephany Young" <noone@localhost>
- Date: Fri, 22 May 2009 16:53:23 +1200
Your issue stems from treating a 'newline' as a single character.
Under Windows, it is, in fact, a pair of characters comprising a Carriage Return character and a Line Feed character in that sequence. It is often referred to as a Cr/Lf pair.
If what you describe is correct then the 'newline' embedded between quotes is not a 'newline' at all.
I suspect it is a single character and is either a Cr or a Lf.
Notepad only recognizes a Cr/Lf pair as a line break and therefore would show your 'line' unbroken. I further suspect that where the embedded 'newline' should have been, Notepad would have shown an 'unprintable' character which looks like a hollow rectangle.
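One way to check that suspicion is to look at the raw bytes and count each kind of terminator. The following is a minimal sketch in Python rather than VB.NET; the function name and the sample bytes are made up for illustration and are not taken from the original file.

```python
import re

def count_terminators(data: bytes) -> dict:
    """Count Cr/Lf pairs, lone Cr characters and lone Lf characters."""
    return {
        "crlf": len(re.findall(rb"\r\n", data)),
        "cr": len(re.findall(rb"\r(?!\n)", data)),   # Cr not followed by Lf
        "lf": len(re.findall(rb"(?<!\r)\n", data)),  # Lf not preceded by Cr
    }

# Hypothetical sample: two rows ending in Cr/Lf, with lone Lf characters
# embedded inside the quoted address field of the first row.
sample = b'a,"Sharrock Ashley\nTEST STREET 1\nTEST NSW 2200",b\r\nc,d\r\n'
counts = count_terminators(sample)
print(counts)
```

If the "lf" or "cr" count is non-zero while the rows themselves end in Cr/Lf, the embedded breaks are single characters, as suspected above.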
The StreamReader.ReadLine() method defines a line as a sequence of characters followed by a Line Feed character, a Carriage Return character or a Cr/Lf pair.
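To see why that definition breaks the row apart, here is a small Python sketch that imitates the rule just described (it is not the .NET implementation, and `read_lines` is a hypothetical helper): any of Cr/Lf, a lone Cr, or a lone Lf ends a line.

```python
import re

def read_lines(text: str) -> list:
    """Split on a Cr/Lf pair, a lone Cr, or a lone Lf."""
    lines = re.split(r"\r\n|\r|\n", text)
    # A trailing terminator leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

# Hypothetical one-row file with lone Lf characters inside the quotes.
record = 'x,"Sharrock Ashley\nTEST STREET 1\nTEST NSW 2200",y\r\n'
pieces = read_lines(record)
print(pieces)
```

The single logical row comes back as three pieces, which matches the symptom of ReadLine returning data only up to the first embedded break.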
In your case, calling 'ReadToEnd' and then splitting on Environment.NewLine is the appropriate course of action.
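The same idea, sketched in Python for illustration: read everything, then split only on the two-character Cr/Lf sequence (the value Environment.NewLine has on Windows). The constant and function names are assumptions made for this example. Note that this only works when the embedded breaks are a lone Cr or a lone Lf; if a quoted field contained a full Cr/Lf pair, the row would still be split.

```python
WINDOWS_NEWLINE = "\r\n"  # stands in for Environment.NewLine on Windows

def split_records(text: str) -> list:
    """Split on Cr/Lf pairs only, leaving lone Cr or Lf characters in place."""
    records = text.split(WINDOWS_NEWLINE)
    # A trailing Cr/Lf leaves one empty string at the end; drop it.
    if records and records[-1] == "":
        records.pop()
    return records

# Hypothetical file content: two rows, the first with embedded lone Lf.
text = 'x,"Sharrock Ashley\nTEST STREET 1\nTEST NSW 2200",y\r\nnext,row\r\n'
records = split_records(text)
print(len(records))
```

Each element of `records` is then a complete row, with the embedded breaks still inside the quoted field.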
"Harry Strybos" <harrystrybos@xxxxxxxxxxxxxxx> wrote in message news:OwZJqYp2JHA.1092@xxxxxxxxxxxxxxxxxxxxxxx
"Herfried K. Wagner [MVP]" <hirf-spam-me-here@xxxxxx> wrote in message news:%23dukvrn2JHA.1712@xxxxxxxxxxxxxxxxxxxxxxx"Harry" <harryNoSpam@xxxxxxxxxxxxxxxxxx> schrieb:I have a .csv file that contains newline chars embedded between quotes in a line of text eg
BSPADV1,John.public,9413,"Sharrock Ashley
TEST STREET 1
TEST NSW 2200",Address Insufficient,,,Mbase Print Report,R7TDKPFMDBCLKE07CGJMFKKW6VVB/21,Sharrock Ashley
There are actually some 19 columns of data but when a StreamReader.ReadLine method tries to read line by line, it only returns data up to the first embedded newline chars. Interestingly, the data does display correctly in Notepad, so I guess Notepad must ignore newline chars inside quotation marks.
What do you mean by "displays correctly"? Notepad just displays the text contained in the file.
Is there any way to read the above line and get the full line of data? My only thought so far is to use the ReadToEnd method and then try to remove the newline chars between quotes programmatically before splitting on the "real" newline chars.
This would be one possible approach. You may want to take a closer look at regular expressions for simple "parsing" of the text file. Alternatively you may want to read the file line-by-line, analyze each line and concatenate the parts of a row which is split into multiple lines manually. However, the best approach depends on what exactly you want to achieve.
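The line-by-line alternative can be sketched as follows, in Python for illustration. It assumes the usual CSV convention that a row is complete once it contains an even number of double-quote characters (an escaped quote is written as two quotes, so it does not disturb the count). The helper name and the `joiner` parameter are hypothetical.

```python
def merge_quoted_lines(lines, joiner="\n"):
    """Rejoin physical lines that belong to one row with an open quoted field."""
    rows = []
    pending = None  # partial row whose quoted field is still open
    for line in lines:
        current = line if pending is None else pending + joiner + line
        if current.count('"') % 2 == 1:
            pending = current      # odd quote count: field not closed yet
        else:
            rows.append(current)   # balanced quotes: the row is complete
            pending = None
    if pending is not None:
        rows.append(pending)       # unterminated quote at end of input
    return rows

physical = [
    'BSPADV1,John.public,9413,"Sharrock Ashley',
    'TEST STREET 1',
    'TEST NSW 2200",Address Insufficient',
    'next,row',
]
rows = merge_quoted_lines(physical)
print(len(rows))
```

Passing `joiner=" "` instead would flatten the address onto one line, which may be preferable if the embedded breaks are not wanted in the final data.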
--
M S Herfried K. Wagner
M V P <URL:http://dotnet.mvps.org/>
V B <URL:http://dotnet.mvps.org/dotnet/faqs/>
Thanks Herfried for your answer and your excellent support of this group.
I have solved the problem by getting the StreamReader to load the entire file into a string var and then doing a buffer.Split(CChar(Environment.Newline)). I then read all the lines from the array produced.
Seems StreamReader.ReadLine only grabs a row of data up to the first newline char it encounters (kinda makes sense).
The method I have now employed seems to ignore any newline characters enclosed in quotation marks.
Thanks again for your help
Harry