Re: Parsing Multipart formdata



The other part of "multipart" is MIME. If you Google various MIME
details, you can find lots more information.

Basically, it works like this:

1. Read the request headers and look for the "boundary" parameter on
the Content-Type header.

2. Read the string out of the boundary parameter.

3. Keep reading past the headers until you find that boundary string
in the body. In this case it's
"---------------------------3765104465873", and it's normally going to
be a whole pile of hyphen characters followed by a number, just like
that. Each time it appears in the body it is prefixed with two extra
hyphens, and the very last one is also followed by two hyphens.

4. Start reading the part header, save that for future reference.

5. Keep reading the part header until you find a blank line. HTTP and
MIME both define the line ending as \r\n (carriage return, then
newline), so the part header ends at \r\n\r\n. It doesn't hurt to
tolerate a bare \n from sloppy clients as well.

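6. Start reading data into a byte buffer. Don't use a string buffer
here, because a text encoding conversion can corrupt binary data.

7. Stop reading into the buffer when you see the boundary again. The
\r\n immediately before the boundary belongs to the boundary, not to
your data.

8. Check the part header for a Content-Transfer-Encoding line. If it
says base64, un-Base64-encode the contents of your buffer to get an
array of bytes; I seem to remember there being a framework Base64
codec. Browsers normally don't send that header for file uploads, in
which case the buffer already holds your binary data as-is.

So you never need a length for the part: the boundary tells you where
the data ends. Base64, where it is used, is an ASCII text
representation of the binary data. Rip the data out from between the
part header and the boundary, decode it if the part header says so,
and you have your stream of bytes; then you can write them to disk or
whatever.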
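
As a rough illustration of the steps above, here is a minimal sketch
in Python (chosen only to keep it short; the same logic ports to any
language). It assumes the whole request body is already in memory as
bytes. The names get_boundary and parse_multipart are made up for this
example and are not part of any framework.

```python
import base64


def get_boundary(content_type):
    """Return the boundary parameter of a Content-Type value as bytes."""
    for param in content_type.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "boundary":
            return value.strip('"').encode("ascii")
    raise ValueError("no boundary parameter in Content-Type")


def parse_multipart(body, boundary):
    """Split a fully buffered body into (headers, data) pairs."""
    # In the body every boundary line is CRLF + "--" + boundary. Prepending
    # a CRLF lets the very first boundary line match the same pattern.
    delimiter = b"\r\n--" + boundary
    sections = (b"\r\n" + body).split(delimiter)[1:]  # drop the preamble
    parts = []
    for section in sections:
        if section.startswith(b"--"):
            break  # "--boundary--" marks the end of the last part
        head, _, data = section.partition(b"\r\n\r\n")
        headers = {}
        for line in head.split(b"\r\n"):
            name, sep, value = line.partition(b":")
            if sep:
                key = name.strip().lower().decode("ascii", "replace")
                headers[key] = value.strip().decode("utf-8", "replace")
        # Decode only when the part declares Base64; otherwise keep raw bytes.
        if headers.get("content-transfer-encoding", "").lower() == "base64":
            data = base64.b64decode(data)
        parts.append((headers, data))
    return parts
```

Each data value is a bytes object, so it can be written straight to a
file opened in binary mode. The sketch does not handle header lines
that are folded across several lines.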

HTH. Please ask questions if any of those steps don't make sense to
you. I have done this many times, as likely have many others who read
this board. There are lots of little nuances that can make or break
your application.
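
One nuance worth sketching: if you read the socket in chunks instead
of buffering the whole body, the boundary can straddle two reads. The
Python sketch below holds back the last few bytes of each chunk so a
split boundary is still found. It assumes a file-like stream
positioned at the first data byte of a part, and a delimiter of
b"\r\n--" plus the boundary string; copy_until_delimiter is a
hypothetical helper name, not a library function.

```python
import io


def copy_until_delimiter(stream, delimiter, out, chunk_size=4096):
    """Copy part data from stream to out, stopping at delimiter.

    Returns (bytes_copied, leftover), where leftover holds any bytes
    that were read past the end of the delimiter.
    """
    keep = len(delimiter) - 1  # longest delimiter prefix a chunk can end with
    pending = b""
    copied = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise ValueError("stream ended before the delimiter was seen")
        pending += chunk
        pos = pending.find(delimiter)
        if pos != -1:
            out.write(pending[:pos])
            return copied + pos, pending[pos + len(delimiter):]
        # Everything except the last `keep` bytes cannot be part of a
        # delimiter that has not fully arrived yet, so it is safe to flush.
        safe = len(pending) - keep
        if safe > 0:
            out.write(pending[:safe])
            copied += safe
            pending = pending[safe:]
```

The returned count is the length of the part's data, which is only
known once the boundary has been seen.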


Stephan



Cuong.Tong@xxxxxxxxx wrote:
Greetings,
I am writing my own web server and am having some problems parsing the
multipart/form-data stream that is sent from the browsers.



I have a form that looks something like this:

<form action="process.dll" method="post" enctype="multipart/form-data">
<input type="file" name="fileupload">
</form>

So when I choose the local file from the browser, and click submit it
will take me to the process.dll file.

The browser will send a POST request to the server with headers that
look something like this:

-------------Start REQUEST Headers--------------
Content-Length : 28624
Content-Type : multipart/form-data; boundary=---------------------------3765104465873
Connection : keep-alive
Cookie : SESSION=cPnKc7PmT8wdsy+:ccPnKlJF1Af1d
Host : localhost:9000
Referer : http://localhost:80/ajaxupload.html
User-Agent : Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0
url : /backend/fileupload/test
Accept-Language : en-us,en;q=0.5
Accept-Charset : ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept : text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding : gzip,deflate
Keep-Alive : 300
method : POST
-----------------------------3765104465873
Content-Disposition: form-data; name="filename"; filename="review form.doc"
Content-Type: application/msword

Some binary content blah blah

-----------------------------3765104465873--

I can get my stream reader to read up to the application/msword line,
in other words the beginning of the binary stream. However, I have no
way to know how many bytes to read in, that is, the length of the
binary content of the current part.

Please note I have no access to the ASP.NET libraries, as I am using
my own web server.

Any hints and/or comments are appreciated.






Regards,
