Re: 1.5 MB input file generates 940MB of data in BizTalkMsgBoxDb



Just for comparison,

I ran the following interchange on an XML file I worked on a while back:

Before Test:

1) Original XML File to split -> 38MB.
Number of items to split in XML file -> 8265
2) Envelope and Document Schema deployed to split the XML file.
3) Receive Port/Location configured with File Adapter, standard XML Receive
Pipeline.
4) Initial Size of Msgbox DB Sql Data File -> 35 MB
Intial Size of Msgbox DB Sql Log File -> 1 MB
Size of files read from Sql Server Management Studio (Database properties
for BizTalkMsgBoxDb)
AutoGrowth of the msgbox log and datafiles to 1 MB.
5) Configured File Send port with filter to subscribe to split messages
published to msgbox by the receive port.

Test Observations:

1) Interchange processing time:
approx: 3-4 minutes
2) Total time -> (all 8265 split messages in Send ports file directory)
approx 10 minutes
3) Each split xml file approx 3KB in size.
4) Final Size of Sql Data File -> 106 MB
Final Size of Sql Log File -> 256 MB

Therefore, I am not surprised by your results, but maybe the final
large size of the your database files is excessive. But the overall
behaviour you observed is normal for this type of operation.

There are large db transactions set up to rollback the operation, if the
process fails,
therefore the large log file size.

Setting Recoverable Interchange processing to true, does not alleviate the
size of the final log file. It will just not cause a total rollback if one
split message ends in failure (the bad split messages will be promoted as
failed messages and can be subscribed to and the good ones will be published
as is to the msgbox).

You did correctly mention that a compact flat file can grow in size. Ten
fold (worst case scenario, average is probably 4 to 6 ) when parsed to xml.

You cannot make a direct comparison, between the initial size of the
incoming file and the final size of your sql data and log files. This is just
the overhead processing that is incurred.

For the excessive size of your files, what is the AutoGrowth Factor of your
msgbox database?
(you can set these on properties for a database in Sql Server Management
studio):
ie: you can set the AutoGrowth factor for your files to 1 to 100%, or to
grow for example by 1 or 100 MB each time more space is needed. Maybe the
Autogrowth factor is set to a large increment for your msgbox database ?
I would recommend setting the initial size of these files a large number.

Look in the BizTalk help for tracking messages. The split messages published
into the messagebox database and successfully processed will not stay there
forever. If you have tracking turned on they will be moved to the tracking
database,otherwise they will be removed from the msgbox. (Make sure your Sql
Server Agent is running)

Look at the Sql Server Agent jobs set up by Biztalk. they are constantly
running in the background to help maintain the msgbox and move messages from
the Msgbox to tracking database).

Not sure how good you are on Sql Server Maintenance, but understand and read
the help on Sql Server Full and Simple backups, how to maintain differential
backups etc, so your log file will not get too big.

I would stick with the pattern you have implemented (Splitting the message
in a Pipeline).
You could as you suggested parse the flat file into a single large xml file
for a Orchestration to possibly process. It will handle it, I have processed
large XML files in orchestration before, but the processing turns into a
singleton pattern with processing only one message at a time in a loop. This
will slow down things for you and it sounds like your subscriber is keeping
up to your exisiting pipeline splitting processing.

Just for interest, for some other implementations on your pattern:

http://blogs.conchango.com/davemorris/archive/2005/02/21/1039.aspx
http://objectsharp.com/blogs/matt/archive/2005/08/31/3209.aspx
http://objectsharp.com/blogs/matt/archive/2005/10/23/3525.aspx


Matt


"Jeff Tressler" wrote:

Have you turned on RecoverableInterchangeProcessing on the
disassembler domponent of the pipeline? Perhaps that will make BizTalk
handle messages one at the time without having to go through ALL the
messages before something can be done?
I will have to look that up, that seems a bit more detailed than I have
delved so far. As far as I know, there are two ways to process a flat file.
1) Have the entire file converted into a single XML document, but my
understanding is when you XML begins to exceed 10MB, BizTalk starts running
into problems and our files will generate about a 50MB XML document
2) Use a custom pipeline to generate one XML per line in the file. This is
the method we are trying to use.

We hope there is some extensive logging we accidently turned on but we do
not understand the BizTalk -> SQL Server configuation very well.
--
There are 10 kinds of people, those who understand binary and those that
don''''''''t
There''''''''s no place like 127.0.0.1
================================
3.14159 + Ice Cream = Pi ala mode


"Jan Eliasen" wrote:

On Fri, 1 Dec 2006 11:23:03 -0800, Jeff Tressler
<JeffTressler@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

This works well and generates one XML message for each line in the file. The
problem is that running 10,000 records through the process, the 1.5 MB file
results in an increase of the BizTalkMsgBoxDb by 241 MB and the
BizTalkMsgBoxDb_log by 669 MB. It appears that all this disk increase occurs
during the Receive Port processing.
Have you turned on RecoverableInterchangeProcessing on the
disassembler domponent of the pipeline? Perhaps that will make BizTalk
handle messages one at the time without having to go through ALL the
messages before something can be done?

--
eliasen, representing himself and not the company he works for.

Private blog: http://blog.eliasen.dk

Private email: jan@xxxxxxxxxx

.



Relevant Pages

  • XML file to SQL Server table - best way?
    ... We'd like to get the results out of the XML file and get them into a table ... As far as I know the software on the testers does not have ... the ability to directly transfer the data into SQL server. ... location the SQL server can 'see' so it can access and parse the file. ...
    (microsoft.public.sqlserver.xml)
  • Re: extracting results of XML stored proc to file
    ... extended Stored Proc or something, but it'd be a lot of effort.) ... Agent job and execute it from within SQL Server. ... 'Assign the output stream. ... extract them from the results into an XML file ...
    (microsoft.public.sqlserver.xml)
  • Re: Securing a web DB
    ... >name, uid, password) from an XML file and establish connection to the DB. ... >We are planning to host the site with as ISP. ... >integrated login) for the database. ... >get to know the SQL server password. ...
    (comp.security.misc)
  • Re: How to use content of file for xml input?
    ... I used the bulk load mechanism. ... With these mechasim i'm able to read the file into our SQL Server. ... So thanks for your interest in my problem and best regards, ... how a external xml file can be prepared to use ...
    (microsoft.public.sqlserver.xml)
  • Re: Is this Scenario possible with Biztalk server ?
    ... you can setup biztalk to receive the one schema and use the SQL Adapter ... > I have Application which produce and XML file with specific schema, ... > application must communicate with other application that uses SQL Server ...
    (microsoft.public.biztalk.general)