Mozilla needs to download files more intelligently

Parsap · Post by **Parsap** » February 1st, 2003, 11:14 pm

When you click on a link to a file in Mozilla (a .avi, .wmv, or .rar, for example) sometimes Mozilla displays a bunch of gibberish on your page rather than downloading the file correctly. These download errors happen annoyingly often in Mozilla but hardly affect other browsers. IE, Safari, Opera, OmniWeb, etc. don't suffer from this problem.

IMHO, this is the most annoying aspect of Mozilla. Sadly, according to #mozillazine, the Mozilla developers refuse to address the issue. Apparently the reason why many downloads fail is because the server isn't sending the right metadata. This huge flaw in Mozilla relative to other browsers isn't technically their fault and hence they won't fix it.

That's stupid. Since when has Mozilla adopted this anal retentive standards compliancy even at the expense of usability? It's certainly not present in the HTML rendering code. There's plenty of "quirk modes" that don't follow the W3C guidelines but are there for the sake of compatibility. Why should downloading be any different?

It's a simple problem with a simple solution. Mozilla screws up downloads annoyingly often while most other browsers don't. I don't care if IE's methods or Safari's methods aren't by the book. The fact is that they work.

You should always weigh the benefits and consequences of your actions when programming. In this case, Mozilla's developers were smoking crack.

The downloading method used by 95% of surfers:
PROS: 99.99% of your downloads will work as expected.
CONS: Uh, none? Unless you enjoy reading MP3s, that is.

Mozilla's method of relying solely on metadata:
PROS: Someday, when you're least expecting it, you might click on a .rar file that some smart-ass server admin decided to rename to a .wmv but set the mimetype of .wmv to .rar. You have to be prepared! </sarcasm>
CONS: An aggravatingly large portion of downloads show up as 32JHJ6754J64JK6HJ243H3BJK5BJ.

I rest my case. Mozilla needs a preferences option to mimic the downloading method that 95% of other surfers use.

bzbarsky · Post by **bzbarsky** » February 2nd, 2003, 12:49 am

Parsap wrote:This huge flaw in Mozilla relative to other browsers isn't technically their fault and hence they won't fix it.

Not only is it not Mozilla's fault, but if we did what IE does some perfectly correct pages would _break_ because our guess at the type would be incorrect (I have had several plaintext documents attempt to show up as HTML in IE, usually requiring me to rewrite chunks of the doc to get it to stop fucking it up. Maybe they have improved IE's sniffing since--I've not used it after 5.0).

Parsap wrote:Since when has Mozilla adopted this anal retentive standards compliancy even at the expense of usability?

Since the founding of the project, approximately.

Parsap wrote:Why should downloading be any different?

Well.... Quirks mode is based on detecting that the web page was authored a long time ago, basically. And dealing appropriately, rendering it the way it would have gotten rendered back then.

There is no good way to tell that a server admin has not updated their MIME setup for ages. So there is no way to know when applying the broken "quirks" mode is relatively safe (unlike with HTML).

Parsap wrote:It's a simple problem with a simple solution.

Yes. That solution is to fix the server. To that end, a patch has been submitted to Apache to keep it from sending a type of text/plain when it does not know what the type is. Once we know the server has no idea what the type is (due to the lack of a content-type header), we will of course try to detect the type ourselves.
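
A minimal Python sketch of the policy described in this post: trust the server's Content-Type when it is present, and fall back to content detection only when the header is missing. The signature table and the function names are hypothetical illustrations, not Mozilla's actual code.

```python
from typing import Optional

# Hypothetical magic-number table; a real detector would cover far more formats.
MAGIC_NUMBERS = [
    (b"Rar!\x1a\x07", "application/x-rar-compressed"),
    (b"RIFF", "video/x-msvideo"),  # simplification: RIFF also wraps WAV and others
    (b"%PDF-", "application/pdf"),
]


def sniff_type(body: bytes) -> str:
    """Guess a type from the leading bytes; used only when the server said nothing."""
    for magic, mime in MAGIC_NUMBERS:
        if body.startswith(magic):
            return mime
    return "application/octet-stream"


def effective_type(content_type: Optional[str], body: bytes) -> str:
    """Return the declared type if there is one, otherwise a sniffed guess."""
    if content_type:
        # Drop parameters such as "; charset=..." and normalise case.
        return content_type.split(";", 1)[0].strip().lower()
    return sniff_type(body)
```

Under this sketch a RAR archive sent as text/plain is still treated as text/plain, which is exactly the behavior the thread is arguing about; the same archive sent with no Content-Type at all would be detected as an archive.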

Parsap wrote:You should always weigh the benefits and consequences of your actions when programming. In this case, Mozilla's developers were smoking crack.

Always nice to have a nice mature discussion with someone....

Parsap wrote:Mozilla's method of relying solely on metadata:
PROS: Someday, when you're least expecting it, you might click on a .rar

Someday you may have to deal with extensions that are mapped to multiple file types. Last I tried IE on a RPM file that was sent as application/x-rpm (instead of the realaudio type), IE fucked it up badly (fed it to RealPlayer, and without asking for confirmation too).
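
A rough Python sketch of why the extension alone cannot settle the type. The table is an assumption made for illustration (the exact RealAudio type string in particular), and the function names are hypothetical.

```python
import os
from typing import List, Optional

# Illustrative table only; the entries are assumptions, not a real browser's mapping.
EXTENSION_TYPES = {
    ".rpm": ["application/x-rpm", "audio/x-pn-realaudio-plugin"],
    ".wmv": ["video/x-ms-wmv"],
}


def candidates_for(filename: str) -> List[str]:
    """List every type the file extension could stand for."""
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_TYPES.get(ext, [])


def pick_type(filename: str, declared_type: Optional[str]) -> Optional[str]:
    """Prefer the server's declared type; guess from the extension only if unambiguous."""
    if declared_type:
        return declared_type
    candidates = candidates_for(filename)
    if len(candidates) == 1:
        return candidates[0]
    return None  # ambiguous or unknown: ask the user instead of guessing
```

With a declared type of application/x-rpm the answer is unambiguous; without it, a .rpm file has two plausible handlers and the sketch declines to pick one.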

In this case it's a decision between being backwards compatible and forwards compatible. I would rather be forwards compatible any day, so that two years from now web developers are not going around cursing how broken the browser I wrote is (as they do now about NS4 and IE4, as they are starting to do about IE5 and as they will do about IE6, I predict).

Now adding a UI option to take what's currently rendering in the content area and attempt to redetect the type... that may be doable (it would be a lot of work unless we assume that the data is in cache, but....)

Bowser · Post by **Bowser** » February 2nd, 2003, 1:04 am

Yes, I'll be loading QuickTime movies and then it will just stop and I have to reload and do it all over again. And I have dial-up!

bzbarsky · Post by **bzbarsky** » February 2nd, 2003, 1:25 am

Bowser wrote:Yes, I'll be loading QuickTime movies and then it will just stop and I have to reload and do it all over again. And I have dial-up!

Yeah, that would be bad. Which is why I did _not_ suggest that we do that. Note that I said "redetect the type", not "redownload the data".

Also notice what I said about cache. This would be _trivial_ if we were willing to redownload the data; the whole point is to not do that.

bim · Post by **bim** » February 2nd, 2003, 8:52 am

Netscape does seem to use other means. I was just downloading Komodo 2.0.1 from ActiveState and with Mozilla it wanted to save it as a .txt while Netscape 7.01 automatically suggested the correct extension (.msi).

Not sure if this can be related, but the file I downloaded with Mozilla was broken. That's why I tried downloading it with Netscape and now Komodo did install...

Parsap · Post by **Parsap** » February 2nd, 2003, 11:38 am

bzbarsky wrote:Not only is it not Mozilla's fault, but if we did what IE does some perfectly correct pages would _break_ because our guess at the type would be incorrect (I have had several plaintext documents attempt to show up as HTML in IE, usually requiring me to rewrite chunks of the doc to get it to stop fucking it up. Maybe they have improved IE's sniffing since--I've not used it after 5.0).

Right. A minute amount of pages will break. Guess what? A huge amount of pages will start working. Furthermore, it will be an option. If you honestly enjoy having tons of downloads display as plaintext when obviously they were meant to be downloaded, turn it off. You can live in your ideal world, the rest of Mozilla's user base will live in the real one.

bzbarsky wrote:
Parsap wrote:Since when has Mozilla adopted this anal retentive standards compliancy even at the expense of usability?

Since the founding of the project, approximately.

Bullshit. Take a look at the preferences. You can change Mozilla's behavior for tons of stuff. Hell, I could make Mozillazine white on black if I wanted, "ignoring the background and color specified." I simply want an option to "ignore the mimetype specified" and download it like IE does. (Actually, IE takes the mimetype and other items into account when intelligently deciding what to do.)

bzbarsky wrote:Well.... Quirks mode is based on detecting that the web page was authored a long time ago, basically. And dealing appropriately, rendering it the way it would have gotten rendered back then.

In other words, it detects that obviously if Mozilla applied its anal retentive standards to the page, it would look like crap, and it exempts it from its scrutinizing parsing. How is that any different from what IE, Netscape, etc. does when downloading files?

bzbarsky wrote:There is no good way to tell that a server admin has not updated their MIME setup for ages. So there is no way to know when applying the broken "quirks" mode is relatively safe (unlike with HTML).

Yes there is. Look at how IE, Netscape, etc. does it. It's not perfect, but it has a much, much better track record than Mozilla.

bzbarsky wrote:Yes. That solution is to fix the server. To that end, a patch has been submitted to Apache to keep it from sending a type of text/plain when it does not know what the type is. Once we know the server has no idea what the type is (due to the lack of a content-type header), we will of course try to detect the type ourselves.

Heh. So instead of conforming Mozilla to the world, you want the world to conform to Mozilla. Not the best idea when Mozilla is struggling with a 5% user base.

No. The solution is to add an option for realists.

bzbarsky wrote:
Parsap wrote:You should always weigh the benefits and consequences of your actions when programming. In this case, Mozilla's developers were smoking crack.

Always nice to have a nice mature discussion with someone....

Always a pleasure to have a discussion with someone born without a sense of rhetoric or humor.

bzbarsky wrote:In this case it's a decision between being backwards compatible and forwards compatible. I would rather be forwards compatible any day, so that two years from now web developers are not going around cursing how broken the browser I wrote is (as they do now about NS4 and IE4, as they are starting to do about IE5 and as they will do about IE6, I predict).

Uh, why not make it compatible with what is out there now, and then in two years, if your method actually makes sense, switch it back?

bzbarsky wrote:Now adding a UI option to take what's currently rendering in the content area and attempt to redetect the type... that may be doable (it would be a lot of work unless we assume that the data is in cache, but....)

Right. An option is what I've been going for since the beginning. I simply want an option for Mozilla to do whatever IE, Safari, Opera, Netscape, etc. are doing. It clearly works well.

bzbarsky · Post by **bzbarsky** » February 2nd, 2003, 12:09 pm

Parsap wrote:Right. A minute amount of pages will break. Guess what? A huge amount of pages will start working.

By that argument, why bother implementing any standards at all? Why not "do whatever IE does"? Lots more pages would work then....

For that matter, why not just use IE? Why bother writing a different browser? It clearly works so much better, no? (This is a serious question, not a rhetorical one. I would dearly like to know your motivations for using Mozilla, so that I can evaluate which parts of the project philosophy you do or do not care about.)

Parsap wrote:I simply want an option to "ignore the mimetype specified" and download it like IE does. (Actually, IE takes the mimetype and other items into account when intelligently deciding what to do.)

Yes, yes. IE's behavior is fairly well documented. ;) An option like that would certainly be a possibility. Putting it in without putting a good type detection mechanism would be inadvisable, but I've been improving the type detection slowly (needed for file:// and ftp:// no matter what) and at some point this may become a viable solution.

Parsap wrote:How is that any different from what IE, Netscape, etc. does when downloading files?

Yes there is. Look at how IE, Netscape, etc. does it. It's not perfect, but it has a much, much better track record than Mozilla.

Well, IE applies its sniffing NO MATTER WHAT, even when it's obviously wrong. We do _not_ apply quirks mode to obviously standards compliant (has a correct, modern doctype) pages.

As for Netscape.... I have to say that I did some extensive testing of various browsers (on Linux, mostly, but somewhat on Windows) for precisely the behavior we are talking about. At the time, Mozilla, Netscape 4, Opera (6 for Linux, iirc), lynx, links, w3m, amaya all had the same behavior -- they followed the server specified type. I was unable to test Konqueror, unfortunately. IE, of course, did what IE does. So I'm not sure why you keep mentioning Netscape -- in my experience every single version of Netscape behaves as Mozilla does.

Parsap wrote:Heh. So instead of conforming Mozilla to the world, you want the world to conform to Mozilla. Not the best idea when Mozilla is struggling with a 5% user base.

Actually, no. Instead of conforming Mozilla to the world, we are attempting to convince everyone to conform to an existing clear set of rules such that any two systems conforming to those rules will work with each other. This is the whole point of "standards" -- the goal is not purity for the sake of purity, but compliance for the sake of interoperability. The alternative is what we have already seen with tag soup, where half the browser code consists of workarounds for bugs in dozens of servers and thousands of pages, and half (or more) of the web page code consists of workarounds for bugs in browsers. The situation rapidly becomes untenable (you should consider perusing the HTML parser source and the table layout source to see what I mean).

Parsap wrote:Always a pleasure to have a discussion with someone born without a sense of rhetoric or humor.

I have both. I just happen to think that insults make for poor rhetoric and lousy humor.

Further, rhetoric is somewhat out of place when you're having a technical discussion (which is what I think we're having here). If I'm mistaken, and this is not supposed to be a discussion about how to solve a problem but rather a way for you to hear yourself talk, please forgive me and continue your calls to action in majestic solitude.

Parsap wrote:Uh, why not make it compatible with what is out there now, and then in two years, if your method actually makes sense, switch it back?

Because then people will still be using the old broken version! Again, look at NS4 and IE4/5. Many things have been fixed in IE6, and still web page authors have to struggle with the horrible CSS bugs of IE5, putting in weird hacks and attempting to hide stylesheets from it, cursing it the whole while.

Again, I'm not completely against adding options to, at the user's discretion, ignore server-provided types in some cases. We need a good UI for this (a blanket option would perhaps work, but I would rather not do that because that will break _other_ content) and we need the technical means of doing good MIME type detection (being worked on).
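
One way a non-blanket option could be scoped, as a hedged Python sketch: offer redetection only when the server declared text/plain and the first bytes clearly are not text. The 512-byte sample, the control-byte heuristic, and both function names are assumptions for illustration; nothing in this thread says Mozilla does or will do it this way.

```python
# Control bytes that do turn up in ordinary text: tab, LF, FF, CR, ESC.
TEXT_CONTROL_BYTES = {9, 10, 12, 13, 27}


def looks_binary(body: bytes, sample_size: int = 512) -> bool:
    """True if the leading bytes contain control characters that plain text never uses."""
    sample = body[:sample_size]
    return any(b < 32 and b not in TEXT_CONTROL_BYTES for b in sample)


def should_offer_redetect(declared_type: str, body: bytes) -> bool:
    """Offer the user a 'redetect type' action only in the narrow suspicious case."""
    base = declared_type.split(";", 1)[0].strip().lower()
    return base == "text/plain" and looks_binary(body)
```

Because the check runs on data already received, it fits the constraint stated earlier in the thread: redetect the type without redownloading the data. Correctly labelled plain text and every other declared type are left alone, which avoids breaking other content.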

JLP · Post by **JLP** » February 2nd, 2003, 1:05 pm

If files do not have the right MIME type set then do what I do. Just send a polite email to the webmaster of the site and tell him that his server is configured incorrectly. I was able to fix problems on quite a few pages this way. Not only does it work in all browsers then, it is also the only correct way of solving this problem.

Yui · Post by **Yui** » February 2nd, 2003, 6:48 pm

bzbarsky wrote:
Parsap wrote:This huge flaw in Mozilla relative to other browsers isn't technically their fault and hence they won't fix it.

Not only is it not Mozilla's fault, but if we did what IE does some perfectly correct pages would _break_ because our guess at the type would be incorrect (I have had several plaintext documents attempt to show up as HTML in IE, usually requiring me to rewrite chunks of the doc to get it to stop fucking it up. Maybe they have improved IE's sniffing since--I've not used it after 5.0).

I have used IE to download probably over a hundred files and IE "sniffs" correctly 99% of the time. I am rather surprised how there is so much opposition to what appears to be an easy decision to make. Are people using standards to cover up for their laziness in writing the code? What other browser has this problem? It seems that every other browser has found a workable solution *except* mozilla.org. Are the programmers at mozilla.org just not good enough?

- Yui

Thumper · Post by **Thumper** » February 2nd, 2003, 8:54 pm

Nice one Yui. I often find that insulting people gets them to do what I want.

If you'd read the thread, you'd find that the -only- browser which implements sniffing to such a level is IE. Personally I think it would be nice to have a little extension to allow sniffing, as a common exploit for skirting around sites which don't allow offsite image linking is to upload an image with a text type (which isn't blocked) and have IE sniff it, find it's an image and display it properly. I certainly wouldn't ask for such a hack to be in the main browser code and I definitely wouldn't question the developers' competence for (rightly) sticking to their guns on the issue.

- Chris

bzbarsky · Post by **bzbarsky** » February 2nd, 2003, 9:06 pm

Yui wrote:I am rather surprised how there is so much opposition to what appears to be an easy decision to make.

Yep, immoral decisions are often easy to make. ;) For example, deciding to steal rather than work for a living is easy. So?

Yui wrote:What other browser has this problem?

Since you obviously cannot read, I will repeat the list for your special benefit: Mozilla, Netscape of all versions up to and including 7, Opera of all versions prior to 7 (I have not tested 7, so can't claim anything about it), Amaya of all versions, lynx of all versions, links of all versions, w3m of all versions.

In fact every single browser I have tested, except for IE. The browsers I have not tested that are currently in use are Opera 7 and khtml-based browsers (Parsap says that Safari does not have the problem, but he also says that Netscape does not, which is blatantly wrong).

Yui wrote:Are the programmers at mozilla.org just not good enough?

That must be it. It's not that they have other priorities, or consider this change to be incorrect per se, or anything like that. They must just not be good enough.

Well, if you would like to fix this, feel free to. This is an open project, and anyone is welcome to contribute code, if it's good code. I recently reorganized MIME type sniffing to make adding new detectors absolutely trivial. See http://lxr.mozilla.org/seamonkey/source ... er.cpp#288 and http://lxr.mozilla.org/seamonkey/source ... coder.h#96 for the code; it's quite well documented and should be easy to figure out for someone with your mad skillz.

And if you're not willing to put up either code or constructive ideas, then shut up.

Parsap · Post by **Parsap** » February 3rd, 2003, 12:37 am

bzbarsky wrote:By that argument, why bother implementing any standards at all? Why not "do whatever IE does"? Lots more pages would work then....

Well, emulating IE's rendering system in Mozilla would be way too much work for too little benefit. It would be an unimaginable amount of work and I can't remember the last time I saw a page that rendered incorrectly in Mozilla. While it would be nice to see a preferences option "emulate MSIE 6's rendering engine" I probably wouldn't even notice the difference.

On the other hand, I notice Mozilla's download failures all the time. I'm not kidding when I say that this is the number one annoyance in Mozilla for me. Unlike the rendering engine, it would be relatively trivial to make extremely compatible and it would solve a serious problem that actually exists.

bzbarsky wrote:For that matter, why not just use IE? Why bother writing a different browser? It clearly works so much better, no? (This is a serious question, not a rhetorical one. I would dearly like to know your motivations for using Mozilla, so that I can evaluate which parts of the project philosophy you do or do not care about.)

The only aspect of IE that is better, IMHO, than Mozilla is this single problem that I'm bitching about. Otherwise I love Mozilla. I use it for tabbed browsing, pop up blocking, image blocking, cool themes, url keywords, and probably some other stuff that IE doesn't have.

bzbarsky wrote:Yes, yes. IE's behavior is fairly well documented. ;) An option like that would certainly be a possibility. Putting it in without putting a good type detection mechanism would be inadvisable, but I've been improving the type detection slowly (needed for file:// and ftp:// no matter what) and at some point this may become a viable solution.

That is excellent news! I will be eagerly awaiting.

bzbarsky wrote:I'm not sure why you keep mentioning Netscape -- in my experience every single version of Netscape behaves as Mozilla does.

Originally I wasn't mentioning Netscape because I assumed since it used Gecko, it would be like Mozilla. Then I read bim's post where he says: "Netscape does seem to use other means. I was just downloading Komodo 2.0.1 from ActiveState and with Mozilla it wanted to save it as a .txt while Netscape 7.01 automatically suggested the correct extension (.msi)" and decided to add it to aid my argument.

bzbarsky wrote:I was unable to test Konqueror, unfortunately.

Just FYI, I heard on #mozillazine that Konqueror uses Mozilla's method, but, analogous to Netscape, when Apple applied khtml to their mass-market browser, Safari, they opted to go with IE's method.

bzbarsky · Post by **bzbarsky** » February 3rd, 2003, 12:52 am

Parsap wrote:Then I read bim's post where he says: "Netscape does seem to use other means. I was just downloading Komodo 2.0.1 from ActiveState and with Mozilla it wanted to save it as a .txt while Netscape 7.01 automatically suggested the correct extension (.msi)"

Ah. That's not the same thing as showing it in the content area, however... in both of them it was trying to save, and I bet one has application/octet-stream mapped to the .txt extension (something that could happen accidentally and got fixed recently).

Thanks for the info on khtml. And if you want to work on getting more type detectors working, that would be fantastic. ;)

dtobias · Post by **dtobias** » February 3rd, 2003, 6:31 am

There's an <A HREF="http://www.w3.org/2001/tag/2002/0129-mime">official W3C document</A> that specifically states that MIME second-guessing, of the sort that MSIE does, is against the standards:

<blockquote>
The architecture of the Web depends on applications making dispatching and security decisions for resources based on their Internet Media Types and other MIME headers. It is a serious error for the response body to be inconsistent with the assertions made about it by the MIME headers. Web software SHOULD NOT attempt to recover from such errors by guessing, but SHOULD report the error to the user to allow intelligent corrective action.
</blockquote>
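
To illustrate the "report the error rather than guess" wording in that passage, here is a small Python sketch. It keeps the declared type, checks the body against a few well-known file signatures, and returns a message for the user when the two disagree. The signature table is deliberately tiny and the function name is hypothetical.

```python
from typing import Optional

# A few well-known file signatures; a real checker would know many more.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"GIF8": "image/gif",
    b"%PDF-": "application/pdf",
}


def consistency_error(declared_type: str, body: bytes) -> Optional[str]:
    """Return a message if the body contradicts the declared type, else None."""
    base = declared_type.split(";", 1)[0].strip().lower()
    for magic, actual in SIGNATURES.items():
        if body.startswith(magic) and actual != base:
            return f"Server declared {base} but the body looks like {actual}"
    return None
```

The declared type is never overridden here; the mismatch is only surfaced so the user (or the webmaster) can take corrective action.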

I discuss this issue some more on <A HREF="http://webtips.dan.info/server.html">my page</A>. I give a <A HREF="http://webtips.dan.info/cgi-bin/plaintext.pl">test URL</A> to see if your browser follows MIME types correctly, or second-guesses, and show how this affects the ability to display simple plain text, properly announced by the server. Another site has a <A HREF="http://entropymine.com/jason/testbed/mime/">more thorough testbed</A>.

dtobias · Post by **dtobias** » February 3rd, 2003, 6:35 am

<A HREF="http://ppewww.ph.gla.ac.uk/~flavell/www/content-type.html">Another interesting page</A> on this subject.
