Re: Homegrown synchronization
- From: "David W. Fenton" <XXXusenet@xxxxxxxxxxxxxxxxxxx>
- Date: Sat, 23 Dec 2006 15:53:30 -0600
rdemyan@xxxxxxxxxxx wrote in
news:1166892795.770261.217870@xxxxxxxxxxxxxxxxxxxxxxxxxxx:
I'm concerned about the lack of support by Microsoft for database
replication and indirect synchronization.
MS has never been good supporting Jet replication. I can only think
that the reason they aren't trumpeting the existence of such an
incredibly well-implemented capability in one of their products is
because they think it will hurt sales of SQL Server. I think this is
the only reason they've never created a Jet server component (like
Filemaker has), because it would probably blow away SQL Server for
applications with up to 255 users.
I believe that the reason Jet replication is so good is because it
was the dry run for implementing the final SQL Server merge
replication. The development groups that worked on Jet 4 and
replication and on SQL Server overlapped a great deal, if I
understand correctly, and a lot of what went into Jet 4 replication
also went into SQL Server (and heterogeneous replication between Jet
and SQL Server).
They don't even supply
replication manager any more.
This is a vast stupidity, yes.
In our tough IT environment, they don't
allow unapproved third-party software (like TSI Synchronizer).
Further IT has not approved the Jet Synchronizer for use and it
currently cannot be installed. Even if they do eventually allow
it, they may at some point in the future take it away. Also since
the Replication Manager is now no longer provided with Office
Professional 2003, I seriously doubt they will allow its
installation even if they can get a copy.
Show them the Jet Replication security update. Even if you were
doing direct replication, you'd need that, obviously, and that
includes the synchronizers. So, it *is* supplied direct from
Microsoft.
Although Jet direct synchronization is a possibility, a number
of users are on WANs so corruption is possible (many would say
inevitable).
The loss of replicability is the main thing. That happens any time
you compact a replica with any of its "possible corruption" flags
set.
The institution I work with is at the forefront of network
security. Since my app is only used by a relatively small number
of people and is not mission critical, they could care less about
our requirements. Who knows what they will do in the future. So,
I'm thinking about creating my own indirect synchronization code.
Although it will be major work, at least I will own it and nobody
will be able to take it away. My thoughts are to do it
differently than Jet and to not have any features regarding
synchronization of table design or synchronization scheduling.
Sounds like a good plan at this point.
So:
1) Scrap Jet Replication and Synchronization. This is not a
problem since we've only been testing it for use. It was never
implemented in the existing production environment. Start with an
UNREPLICATED back-end file. Each workstation (including laptops)
will have its own "local" backend. In addition, there will be a
backend on a common server that all workstations can connect to.
I still don't understand why LAN-connected machines shouldn't all be
sharing a common back end stored on that LAN's server.
2) All back-end tables currently have a time-stamp field which is
modified if data in the record is changed.
3) Create a "model" database that contains all of the tables from
the back end but is EMPTY of data. This "model" database will be
copied to each user's workstation and be part of my application's
files. When a user updates data on their local backend, code in
my app will create a temporary database from this "model": a
temporary updateDB (temp updateDB). The temp updateDB will
contain all the tables from the backend, but will be initially
empty of data. The code then will write to the temp updateDB only
the records that were added/modified/deleted.
You won't be able to user Autonumber PKs unless you make them part
of a compound key that identifies the source database.
The tables in the temp updateDB, however, will include an extra
field
used by "import" code to decide what to do with the records in a
table in the temp updateDB. I think I can keep this as a Yes/No
field. If 'Yes' then when data from the temp updateDB is imported
into a backend, the record is deleted. If 'No' then the record is
an update to an existing record or addition of a new record.
How does my application know what data to write to the temp
updateDB? Good question. My thinking right now is that I can just
use the TimeStamp field in each table in the backend. For
additions and updates, copy only records where the TimeStamp has
today's date in it (or some such algorithm). Any deletions would
probably need to immediately be written to the temp updateDB
(since that record is removed from the backend tables). This is
an area that needs more thought in order to keep it simple, but I
think is definately doable.
I've done this already in a master/slave relationship. Here's what
you do:
1. use the PKs to determine new additions. That is, do an outer join
and find the records that don't exist on the other end.
2. use DAO to walk the recordset fields to find updated records.
This is a matter of writing on-the-fly SQL that is something like
this:
UPDATE table1
INNER JOIN table1 ON table1.PK = table2.PK
SET table1.field = table2.field2
WHERE table1.field <> table2.field2
Now, it's more complicated than that, as you have to account for
Nulls. And you can't use the same WHERE clause for all data types.
What I usually do is have a CASE SELECT on the field type that
writes an appropriate WHERE clause.
Then, to run it, you walk the fields collection of a recordset with
the updatable fields, and then write the SQL for each field. This
SQL is *very* efficient and runs very quickly, and it only updates
the fields that have been changed.
3. deletes are harder, because you can't delete a record that isn't
there. There are two approaches for this:
A. use a DELETED flag and a DELETED field. The flag means "a user
deleted this" and the DELETED field (a date/time field) says when
it was deleted from the other replica. Now, the problem here is
that you still need to get it around to all the other replicas,
so your topology is going to matter a lot. You won't be able to
synch with just any other database -- you'll only be able to have
pairs of databases. I've never done this except in a master/slave
relationship between two databases, so I didn't have to worry
about that, and don't know what the ideal solution would be.
B. do it like Jet replication and maintain a table of deletions.
Then all you'd be doing is checking for additions to that table
(just like in #1) and then processing the new records
accordingly. This seems to me to be much easier than A) in a
multi-database scenario like yours.
But all of these are dependent on generating unique PKs in each
database that are the permanent PKs.
4) My app will look to add records to the temp updateDB when the
app closes (deletions would presumably already have been added).
Once the temp updateDB is filled with the updated/modified/deleted
data and the Yes/No flag set for each record, then the app will
copy the temp updateDB from the local workstation to a common
server location that all workstations can see.
Your temp database is just your method of sending the data, so I
guess that works. I've always had direct connections when I did this
or used text files (the first time I implemented this was between an
Access MDB and a MySQL database on a web server, the second was
between an Access MDB and another MDB downloaded from a web site;
notice a pattern? :).
So, the idea is that the temp updateDB created by the app on local
workstations should in the vast majority of cases be relatively
small and easily/quickly copied to the common server folder. It
will be named with the user's name and a date (down to the
second). BTW: The reason for using a model DB is that it should
be quicker to simply create an instance of a model database,
rather than creating an entire temporary DB from code (besides
I've got a lot of other coding to do!).
Tony Toews doesn't believe that to be true. He has code on his
website to create temp databases on the fly and he says it's just as
fast, because a Jet MDB created in code doensn't include anything
but the bare-bones properties needed to store the data.
I just don't want to spend the time writing and testing the code to
create the temp file! So I use the method you're proposing, but it's
not because of how long it takes to create the temp database from
scratch.
I already do it this way for my normal temporary database that my
app
uses. In two years, this techniques has always worked as intended
and never failed.
It's exactly what I use for temp files.
5) When my app starts up, it will look for temp updateDBs in a
folder on the server. If it finds any that have not already been
processed (based on prior logs in a frontend table), it will copy
them from the server to the workstation. Then code in the app
will transfer "update/add" data from each table in the temp
updateDB(s), copied over from the server, to the equivalent local
backend table. If the flag for the record in the table in the
temp updateDB is set to 'Yes' that record will be deleted from the
local backend. As noted, my app will contain a front-end table
that will log the filename of each temp updateDB that was imported
as well as the import date.
I would suggest you'll want to archive both the temp databases and
the state of the real database before applying the changes from the
temp databases. This is so you can easily roll back to a previous
state if something goes wrong.
You'll want to do all your synching wrapped in transactions, too.
So, I think this can work for updating workstations. It's an
indirect synchronization scheme because the temp updateDB and the
local back-end file are only opened locally for updating of the
local back-end file.
It's a direct analog of how Jet indirect synch works with dropbox
files.
Scheduling is not needed.
What about between the servers?
When the app opens it will look for temp
updateDB(s) to IMPORT and when it closes it will look to EXPORT
any table changes made by that user into a temp updateDB that will
be copied to the server. Again synchronizing table design changes
is not critical. There are other ways to handle this. If we
change the table design, then we will do it to the most current
copy we can find and run code to take all temp updateDB files on
the server and import them. Most of these updates will likely fail
due to key violations. My app launcher already looks for updates
to my app located on the server and will download them to the
workstation. It is a simple matter to also force downloading of
fresh copies of the backend from the server. Further, in this
scenario, even if the supposedly updated backend on the server
does not truly have all current data, a new copy of my app will
have an empty frontend table that logged previous imports of temp
updateDBs. So it will try again to import temp updateDBs once it
is on the workstation. Most if not all will fail due to key
violations, but it should ensure that the new backend file is
updated.
Sounds like there are pitfalls there, but the basic outline sounds
about right.
HOWEVER, I still need to have a current copy of the backend on
the server. My app launcher allows users to download a copy of
the backend from the server (in case it ever gets corrupted,or if
we've changed the table designs, or if they haven't logged on in
30 to 60 days).
But I'm unclear on how to keep the copy of the backend on the
server up-to-date. I guess I will need to have one or more of the
workstations handle this. My app already has a user setup form
that defines the group a user belongs to as well as what
permissions they have to see data. Only administrators can
add/delete users and set permissions. I could add a checkbox to
that user setup form that sets whether their copy of my app can
update the server backend. Only users that are within the LAN of
the SERVER would be allowed to do this; no users connecting to the
server via a WAN would be allowed to do this. When these users
close my app, the code would not only update their local copy but
also the server copy. An alternative would be to create a program
that would schedule these updates.
Yes, you'll have to schedule somehow. I know that it's possible to
launch an Access MDB that runs code automatically, and this works
when there's not a user logged on (I had a client whose daily
compacts were done this way).
I already have a small
program for scheduled backup and compaction of the server backend
so I could add updating of the server backend tables to this
program. It has a timer (IT won't allow us to use Windows
Scheduler).
Assholes.
You could write a VBScript to do the same thing, you know, using
DAO. Would they let you schedule that?
This might
actually be the best solution. Users on the LAN where the server
is could just launch this small program, make sure the timer is
set and then leave their machines on over night.
It could work, but I'd be worried about it -- it's too easy for it
to get shut down and not run.
Also "housekeeping" is an issue. At what point do we start
getting rid of these temp updateDB files that are accumulating not
only in the server folder but also in the workstation folders.
Since the frontend will have a table that logs every temp updateDB
filename and a date of import, any entries where the date of
import is over 60 days old, for example, could be deleted from not
only the table but also the folder.
What about zipping them up? That gives you a lot more overhead?
Also, you might want to consider not updating every time, and just
updating when a certain number of changes have accumulated.
These are just my initial thoughts and I would appreciate
comments, suggestions, etc.
It's a *very* hard task, and you're running into some of the same
problems with IT, because you really need scheduling to remove as
much latency as possible. But I think it's doable.
But I don't envy you the task.
If I were in your shoes, I'd be putting my energy into lobbying
management into getting IT to approve use of indirect
synchronization.
--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/
.
- Follow-Ups:
- Re: Homegrown synchronization
- From: rdemyan
- Re: Homegrown synchronization
- References:
- Homegrown synchronization
- From: rdemyan
- Homegrown synchronization
- Prev by Date: Re: Indirect synchronization setup with no synchronizers on servers
- Next by Date: Re: Homegrown synchronization
- Previous by thread: Homegrown synchronization
- Next by thread: Re: Homegrown synchronization
- Index(es):