Re: .docx files
- From: Phillip Jones <pjones1@xxxxxxxxxxxx>
- Date: Mon, 21 May 2007 14:19:03 -0400
John McGhie wrote:
Hi Joe:
OK, as I told you directly, I dug around in your document with SimpleText.
It appears that it was produced using the Open Document Format converter
from Source Forge.
Which means that while the document is in XML (it might even be "valid"
XML), Microsoft Word does not have the style sheets to read it with, so to
Word it is a corrupt document.
Joe's documents are NOT "Word documents", which is why the Converter can't
read them. Neither can Word 2007 :-)
More information:
Let's do "XML 101" because I will bet this is not the last we see of this
problem in here :-) Better yet, let's start with "Computing 101", because
that's where this problem has its roots.
Microsoft is a software "company". The word "company" implies it has some
shareholders, who have invested their pension funds in the corporation and
are hoping that it will make a PROFIT :-)
Microsoft has the best word processor on the planet, and it makes a profit
if it can "sell" it. Several other software companies would like to weaken
Microsoft, so they have less competition for things such as databases and
file servers. One way to do that is to wreck Microsoft's cash-cow, which is
Office. These companies have banded together to try to produce an
"alternative" to Microsoft Office, which they are basically willing to give
away for free. This doesn't hurt them, because they do not sell a competing
product anyway, but it does hurt Microsoft, which makes half its money from
Office :-)
The competitors don't want to get sued, so their version of Office produces
"Open Document Format". Very similar to Microsoft's "Open XML" format, but
sufficiently different that they won't get sued.
Microsoft sees this coming, and produces Office without the ability to read
or write Open Doc format. This is a war, and in war, it's generally
considered good practice to wait for the enemy to shoot you, rather than
shooting yourself. So neither Word nor the Mac Converter can read Open Doc.
The open source community has a project under way (with Microsoft
assistance) to produce a translator that can translate a document between
Microsoft Open XML and Open Doc. And that's what I think was used to
produce the documents that Joe can't open.
Now let's do XML 101...
Extensible Markup Language is one of a group of languages that are all based
on Standard Generalized Markup Language. SGML grew out of work done at IBM
in the 60's. HTML is one application of it, and that's what we use to
produce websites.
HTML has a long list of "tags" that have defined names and defined meanings.
An HTML file can have a style sheet, but it's not so important, because all
the tags have fixed names and the browser can simply assume the names all
mean what they are supposed to mean and display them accordingly.
The problem with that is that if you wish to put something on your web page
that is not in the standard Document Type Definition for HTML, you are outta
luck. The browser will not be able to understand whatever code you use, so
will either ignore it or crash or both.
XML is the answer: You can extend XML to describe anything you like,
provided you include a style sheet to tell the recipient what the names you
have used are, and what they mean.
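
A minimal sketch in Python of the "extensible" part, using only the standard library. The tag names and the sample text are invented for illustration. A generic parser can list whatever names the author chose, but it is up to the receiving program to know what they are supposed to mean.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: these tag names are made up, not from any standard.
SAMPLE = "<essay><heading>Twentieth Century</heading><para>Text.</para></essay>"


def tag_names(xml_text):
    """Return the tag names found in xml_text, in document order."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]


print(tag_names(SAMPLE))
```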
A .docx file is a zipped container. Inside is a little website, with files
and folders. The standard specifies a structure for these. You do not have
to have all of the items, but if you have any, they must have the correct
structure. You can have extras, but you have to tell the recipient what
they are, and what's in them.
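
Because the container is an ordinary zip, you can peek inside it with a few lines of Python. This is only a sketch, not how Word or the Converter decide anything: the function name is hypothetical, and it assumes that a Word Open XML file carries `[Content_Types].xml` and `word/document.xml`, while an OpenDocument file carries `mimetype` and `content.xml`.

```python
import zipfile


def classify_package(source):
    """Guess what kind of zipped document source is (a path or a file object)."""
    if not zipfile.is_zipfile(source):
        return "not a zip container"
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        # testzip() returns the name of the first damaged member, or None.
        damaged = package.testzip()
    if damaged is not None:
        return "damaged member: " + damaged
    if "[Content_Types].xml" in names and "word/document.xml" in names:
        return "looks like a Word Open XML document"
    if "mimetype" in names and "content.xml" in names:
        return "looks like an OpenDocument file"
    return "zip container of unknown type"
```

Calling `classify_package("essay.docx")` (file name assumed) on a suspect file would at least tell you whether it is a zip at all, and which family of parts it contains.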
You can use any style sheet you like, and have as many different tags as you
like, but you must say what each one is and what it means. The style sheet
must be inside the .docx, or it must be at a particular URL and the computer
must be on the Internet to get it.
Without the style sheet, the document is totally unintelligible to the
recipient. Complete Swahili -- it has no idea which of the characters are
tags, let alone what the tags mean :-)
Of course, it is up to the maker of the XML file to ensure his recipients
can not only get the style sheet, but that their computers can process the
commands within it. For example: There's no point in sending commands for
right-to-left text to Mac Word, because it can't display that.
I believe Joe's first problem is that the style sheet that applies to those
documents is in an untrusted location, so Microsoft software is going to
refuse to download it.
Without the style sheet, neither Word nor the Converter can even FIND the
"content" of the document, let alone read it.
Both Microsoft and the Open Source community are working on that problem.
The second problem is that once you can understand the language inside the
document, it does not necessarily follow that you can carry out the commands
it contains. Microsoft and the Open Source community are also working on
that problem.
I expect Microsoft Word will be able to handle ODF documents in the near
future, if the user installs a translator to convert from one to the other.
And when that happens, Microsoft's Macintosh Business Unit will have to
determine whether it's profitable for it to spend time and money on bringing
that ability to the Mac. If they decide to do that, they must then decide
whether there is enough demand for the ODF converter for them to enable the
Converter to handle ODF in earlier versions of Microsoft Office.
Microsoft's shareholders might want to be in that discussion, because if
they decide to do that, it's a bit like handing the keys to the Microsoft
cash register to the Open Source Community.
And WE might want to have some input to that decision too. Why would
Microsoft keep making Word if its competitors were giving away an
equivalent? Makes the real thing a bit difficult to "sell". You recall
what happened to Internet Explorer for Mac?
Oops... That's "Politics 101", isn't it :-)
No. That's strictly Bill Gates's response to a hissy fit he had when Apple decided to include Safari in OS X. Safari is just a patched-up version of an abandoned web browser from the UNIX/Linux community. (Because it had been patched so many times, they just abandoned it.)
I've had Safari on my Macs since OS X 10.2.3 and I've opened it 5 times in all that time.
Cheers
On 21/5/07 9:22 AM, in article
1179703325.934637.49110@xxxxxxxxxxxxxxxxxxxxxxxxxxx, "jsafdie@xxxxxxxxxx"
<jsafdie@xxxxxxxxxx> wrote:
John, thanks for this -- I didn't think I was losing my mind, but
after everything suggested didn't work, I was beginning to wonder.
Care to look at one more (the only other one, thankfully, that was
submitted to me as a .docx)? I can't do anything with it either, and
it would be good to know that it was corrupted as well.
Thanks for your time, and sorry to have bothered everyone with this,
which turned out not to be a technical problem at all!!
Joe
On May 20, 3:20 pm, John McGhie <j...@xxxxxxxxxxx> wrote:
Hi Joe:
Ahhhh.... I see your problem :-)
1) The file is corrupt. Nothing will open it, including Word 2007 in
Vista.
2) Examine the file name carefully and you will see it has two extensions,
one after the other: .docx.zip. What you need to do is "remove" the second
extension. You do not unzip it, you simply change the name of the file to
delete the .zip and leave the .docx as the last extension.
3) The name is full of %20 -- this is supposed to be a "space" for web
browsers. But the % character can cause problems. They should have replaced
each "%20" with a space.
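
Points 2 and 3 are easy to script if you have a batch of these to rename. A minimal sketch in Python, with a made-up file name; it only fixes the name and does nothing for a file whose contents are corrupt.

```python
from urllib.parse import unquote


def clean_docx_name(name):
    """Drop a trailing .zip after .docx and turn %20-style escapes into characters."""
    if name.lower().endswith(".docx.zip"):
        name = name[: -len(".zip")]
    return unquote(name)


# Hypothetical example name.
print(clean_docx_name("Twentieth%20Century%20Essay.docx.zip"))
```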
Even if you do all of that, the thing still won't open, in either Word 2007
or the Converter. It is very rare to see a document that bad: Word 2007
can't even recover the unformatted text from it, so it's really pooched.
Sorry: You need to get back to the source and say "A Microsoft MVP and
Microsoft Word Consultant says this document is so corrupt Word 2007 can't
open it. Please fix."
Cheers
On 21/5/07 3:59 AM, in article
1179683994.881031.66...@xxxxxxxxxxxxxxxxxxxxxxxxxxxx, "jsaf...@xxxxxxxxxx"
<jsaf...@xxxxxxxxxx> wrote:
Daiya ----
Yes! It's Blackboard (formerly Web CT Vista); thanks so much for
intuiting that. And although I'll go back and try what John and Philip
have said in the past few posts, I think that your advice will
ultimately do the trick . . . IF, of course, I can figure out where in
Blackboard that option you mention is! I'll dig around for that . . .
(right now, as I hope I've made clear, students attach their files to
me via e-mail or some other Assignment Tool, and when I click on the
link for their attachment, it shows up immediately on my desktop --
or, as Philip and John say above, some other folder I create for the
purpose -- and yet when I drag that file into the converter, it
disappears and no .rtf file takes its place).
Thanks to all . . . it would be great if I could figure this out!
Joe
On May 20, 10:34 am, Daiya Mitchell <daiyaNOS...@xxxxxxxxxxxxxxxx>
wrote:
Wild guess to deal with the downloading issue--is this Blackboard or
somesuch? It has an option where it packages all submitted files into a
folder, which it then zips and lets you download, rather than clicking
on the links one-by-one. I would assume that any Course Management
Software has a similar option. If you can get it to do that, then Safari
won't have to deal with the docx at all, and bypassing might get you a
docx file you can use with the converter.
If this is relevant, it will probably only make sense to Jsafdie. If it
makes no sense to you, please ignore it.
Daiya
PS. *how* did you get your students to spell out "twentieth century"
instead of writing "20th century"?
Don't wait for your answer, click here: http://www.word.mvps.org/
Please reply in the group. Please do NOT email me unless I ask you to.
John McGhie, Consultant Technical Writer
McGhie Information Engineering Pty Ltd http://jgmcghie.fastmail.com.au/
Sydney, Australia. S33°53'34.20 E151°14'54.50
+61 4 1209 1410, mailto:j...@xxxxxxxxxxx
--
------------------------------------------------------------------------
Phillip M. Jones, CET |LIFE MEMBER: VPEA ETA-I, NESDA, ISCET, Sterling
616 Liberty Street |Who's Who. PHONE:276-632-5045, FAX:276-632-0868
Martinsville Va 24112 |pjones@xxxxxxxxxxxx, ICQ11269732, AIM pjonescet
------------------------------------------------------------------------
If it's "fixed", don't "break it"!
mailto:pjones@xxxxxxxxxxxx
<http://www.kimbanet.com/~pjones/default.htm>
<http://www.kimbanet.com/~pjones/90th_Birthday/index.htm>
<http://www.kimbanet.com/~pjones/Fulcher/default.html>
<http://www.kimbanet.com/~pjones/Harris/default.htm>
<http://www.kimbanet.com/~pjones/Jones/default.htm>
<http://www.vpea.org>