Files not saved with correct extension in new Mac betas

Discussion about official Mozilla Firefox builds
danb77
Posts: 32
Joined: May 19th, 2004, 7:58 am

Post by danb77 »

I am experiencing the following problem (as posted by Suresh on mozilla.com)

If I click on a pdf file with a name such as

0803.1234

then Firefox 3.0b4 (OS X) does not recognise it as a PDF file. Perhaps it is seeing the .1234 extension and overruling the MIME type? I checked with Safari and it has no problem recognising the file.

This happens with many recent PDFs at

http://arxiv.org/

where their filenaming scheme is as above.

This affects both viewing the file directly and saving it.


For more information see the thread here:

http://support.mozilla.com/tiki-view_fo ... &forumId=1
Last edited by danb77 on April 4th, 2008, 2:38 am, edited 3 times in total.
a;skdjfajf;ak
Posts: 17002
Joined: July 10th, 2004, 8:44 am

Post by a;skdjfajf;ak »

WFM, Vista HP - downloads .pdf just fine.

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9pre) Gecko/2008040116 Minefield/3.0pre Firefox/3.0 ID:2008040116
code65536
Posts: 59
Joined: October 17th, 2004, 7:01 pm
Location: .us

Post by code65536 »

I picked a random PDF link on that site, clicked, and watched the HTTP headers. This is what I saw:

Code:

Location: http://arxiv.org/PS_cache/arxiv/pdf/0803/0803.4339v1.pdf


So the 0803.4339 isn't a file at all; it's just a redirect to the real file--and one that has the proper PDF extension, too. So if there is a problem, it's not a matter of Minefield ignoring the MIME type, but a matter of Minefield forgetting that it had been redirected.
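To illustrate what "forgetting the redirect" would mean for the suggested filename, here is a rough Python sketch. It is not Firefox code; the function names are made up, and the requested URL is only my assumption about the form of the clicked link (the capture above only shows the Location header).

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse, unquote


def name_from_url(url: str) -> str:
    """Return the last path segment of a URL (the candidate filename)."""
    return unquote(posixpath.basename(urlparse(url).path))


def suggested_name(requested_url: str, location: Optional[str]) -> str:
    """Prefer the redirect target's name when a Location header was seen."""
    return name_from_url(location or requested_url)


# Assumed form of the link that was clicked (hypothetical).
requested = "http://arxiv.org/pdf/0803.4339"
# Location header copied from the capture above.
location = "http://arxiv.org/PS_cache/arxiv/pdf/0803/0803.4339v1.pdf"

print(suggested_name(requested, None))      # redirect forgotten: 0803.4339
print(suggested_name(requested, location))  # redirect honoured: 0803.4339v1.pdf
```

If the save dialog is built from the first URL, the name ends in ".4339"; if it is built from the Location target, it ends in ".pdf".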

However, on Windows (latest nightly build), Minefield brings up the save dialog with the proper extension. So maybe this problem has already been fixed or maybe it's a Mac-only issue...
danb77
Posts: 32
Joined: May 19th, 2004, 7:58 am

Post by danb77 »

I think it is Mac only. I have certainly only found references to it on the web from Mac users.

Regarding the redirect, Firefox *is* redirecting ok, as the correct file is saved, but it is saved with the incorrect extension.

If it is useful, here is the header which is sent (according to the HTTP viewer site)

Receiving Header:
HTTP/1.1·200·OK(CR)(LF)
Date:·Mon,·31·Mar·2008·13:27:21·GMT(CR)(LF)
Server:·Apache(CR)(LF)
Set-Cookie:·browser=rexswain.com.1206970041813095;·path=/;·max-age=946080000;·domain=.arxiv.org(CR)(LF)
Last-Modified:·Mon,·04·Feb·2008·17:29:46·GMT(CR)(LF)
ETag:·"12e0061-120545-4455878627680"(CR)(LF)
Accept-Ranges:·bytes(CR)(LF)
Content-Length:·1180997(CR)(LF)
Connection:·close(CR)(LF)
Content-Type:·application/pdf(CR)(LF)
(CR)(LF)
End of Header (Length = 350)
danb77
Posts: 32
Joined: May 19th, 2004, 7:58 am

Post by danb77 »

The problem is still there in b5 (RC).

I thought a workaround would be to use FlashGot to send the download to a download manager, but even then, the download manager is not picking up the file extension even though the MIME type is specified.
Bluefang
Posts: 7857
Joined: August 10th, 2005, 2:55 pm
Location: Vermont

Post by Bluefang »

The MIME-Type may have been specified, but the headers don't specify a file name. I believe the proper method would be to specify the file name header in the redirection file.

This 'problem' also exists on Linux. I believe the reason it doesn't exist on Windows is that Firefox manually adds the extension based on MIME-Type, because the Windows shell relies on the extension to function properly. *NIX and probably Mac use a file's MIME type to handle files, so extensions are merely a human convenience.

Is this proper behavior? Probably not. Firefox downloads should pay attention to redirects. However changing the file extension (if it's not specified) based on MIME-Type is most likely wrong behavior for systems other than Windows.
There have always been ghosts in the machine... random segments of code that have grouped together to form unexpected protocols. Unanticipated, these free radicals engender questions of free will, creativity, and even the nature of what we might call the soul...
kliu0x52
Posts: 569
Joined: October 18th, 2006, 2:23 pm
Location: .us

Post by kliu0x52 »

Bluefang wrote:Firefox manually adds the extension based on MIME-Type

*tries something*

On Windows, trying to download "0804.0411" will get a file save dialog that says "0804.0411v1.pdf". If Firefox was just appending a .pdf from the MIME type as you suggest, then it'd be "0804.0411.pdf", not "0804.0411v1.pdf", which just so happens to be the filename of the redirected file. So the suggestion earlier that the problem is not a MIME type problem but a problem with Firefox mishandling redirects is correct.
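To make the difference between the two explanations concrete, here is a small Python sketch. It is only an illustration, not how Firefox is implemented; the lookup table and function name are hypothetical.

```python
import os.path

# Tiny illustrative table; a real browser consults a much larger registry.
EXT_FOR_TYPE = {"application/pdf": ".pdf"}


def append_from_mime(name, content_type):
    """Keep the requested name and add the MIME type's extension if missing."""
    wanted = EXT_FOR_TYPE.get(content_type)
    if wanted is None or name.lower().endswith(wanted):
        return name
    return name + wanted


requested_name = "0804.0411"
redirected_name = "0804.0411v1.pdf"  # what the Windows save dialog showed

# A naive extension check treats everything after the last dot as the extension.
print(os.path.splitext(requested_name))  # ('0804', '.0411')

# Explanation 1: the browser appends an extension based on the MIME type.
print(append_from_mime(requested_name, "application/pdf"))  # 0804.0411.pdf

# Explanation 2: the browser uses the name of the redirect target.
print(redirected_name)  # 0804.0411v1.pdf
```

Only the second result matches what the dialog shows, which is why the "v1" in the name is the giveaway.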

The proper behavior should always be to use the filename of the redirected file, especially since some places may direct users to download "foobar-latest.exe" that then redirects to "foobar-1.2.3.exe". AFAICT, Firefox on Windows has always used the redirected filename and not the original when saving.

It does seem strange to me, though, that this is a problem only on non-Windows platforms; it doesn't seem that this is something that should be platform-dependent...

One of you ought to file a bug about this.

danb77 wrote:Regarding the redirect, Firefox *is* redirecting ok, as the correct file is saved, but it is saved with the incorrect extension.

By "Minefield forgetting that it had been redirected", it just means that Minefield is forgetting about the redirect when picking a filename (obviously, it hadn't forgotten about it when retrieving the actual data, or else you'd be getting a 0-byte file since redirection responses have no body, which would be a much more serious problem! ;)).
My addons: NoRedirect | QuickDrag | URL Flipper | TabSubmit
Developers: Make sure to test your addons for RTL compatibility!
Bluefang
Posts: 7857
Joined: August 10th, 2005, 2:55 pm
Location: Vermont

Post by Bluefang »

This might have actually just been fixed.

https://bugzilla.mozilla.org/show_bug.cgi?id=299372

That looks like what this problem is.
kliu0x52
Posts: 569
Joined: October 18th, 2006, 2:23 pm
Location: .us

Post by kliu0x52 »

Bluefang wrote:This might have actually just been fixed.

https://bugzilla.mozilla.org/show_bug.cgi?id=299372

That looks like what this problem is.

Not sure about that... Content-Disposition clarifies the contents of the HTTP response body. It is often used when a script is returning an attachment (which is the case that this bug seems to try to address). For example, if you call a script named "foo", and it returns a file named "bar.ext" in its response body, the script will have to indicate that there is a file in its body, what file name it should have, and what the content type is. And this is where Content-Disposition comes into play. One request, one response.

In the problem that we have in this thread, a script named "foo" is called, and it returns a file named "bar.ext", but not in its response body. Instead, the script returns an empty response body and uses the Location field in its response header to redirect the browser to another URL that represents the file. The browser then makes a second request for the new URL to get the file, which is then downloaded normally. Two requests, two responses.

I haven't looked over the patch very carefully, so it might fix this problem as a side effect, but I suspect that it will most likely have no effect on this.
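The two cases can be sketched in Python like this. It is a simplified model under my own assumptions, not the patch in that bug: the example.com URLs, the `pick_name` helper, and the way exchanges are represented are all made up for illustration.

```python
import posixpath
from email.message import Message
from urllib.parse import urlparse


def pick_name(exchanges):
    """Pick a filename from the final (url, status, headers) exchange.

    Content-Disposition wins if present; otherwise fall back to the
    last path segment of the URL that actually delivered the body.
    """
    url, _status, headers = exchanges[-1]
    disposition = headers.get("Content-Disposition")
    if disposition:
        msg = Message()
        msg["Content-Disposition"] = disposition
        name = msg.get_filename()
        if name:
            return name
    return posixpath.basename(urlparse(url).path)


# Case 1: one request, one response, the name travels in Content-Disposition.
one_response = [
    ("http://example.com/foo", 200,
     {"Content-Disposition": 'attachment; filename="bar.ext"'}),
]

# Case 2: two requests, two responses, the name only appears in the
# redirect target given by Location.
two_responses = [
    ("http://example.com/foo", 302,
     {"Location": "http://example.com/files/bar.ext"}),
    ("http://example.com/files/bar.ext", 200,
     {"Content-Type": "application/octet-stream"}),
]

print(pick_name(one_response))   # bar.ext
print(pick_name(two_responses))  # bar.ext
```

Both cases end up with "bar.ext", but through different headers, so a fix for one does not automatically cover the other.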
Bluefang
Posts: 7857
Joined: August 10th, 2005, 2:55 pm
Location: Vermont

Post by Bluefang »

I know what the difference is, but there was relevant discussion in that bug. Though, admittedly, I didn't read it too closely.

However, looking for other related bugs, there are some from back in 2002 which were WONTFIXed because of the way the filechooser worked (it got displayed before any content was downloaded). More recent ones from 2007 are marked as DUPE of the bug I posted.

I'll file a new bug referencing 299372, asking for the same to be done with Location instead of Content-Disposition.
Bluefang
Posts: 7857
Joined: August 10th, 2005, 2:55 pm
Location: Vermont

Post by Bluefang »

Never mind on that. I've gotten myself even more confused. When writing up the bug, I was trying to get a few test cases, but I was unable to reliably reproduce the expected results.

I used the following site ( http://www.rexswain.com/httpview.html ) to check the headers on the URL http://arxiv.org/pdf/0801.0002
I did this using both Linux and Windows

Windows wrote:Parameters:
URL = http://arxiv.org/pdf/0801.0002
UAG = Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9pre) Gecko/2008040305 Minefield/3.0pre
AEN =
REQ = HEAD ; VER = 1.1 ; FMT = AUTO

Sending request:
HEAD /pdf/0801.0002 HTTP/1.1
Host: arxiv.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9pre) Gecko/2008040305 Minefield/3.0pre
Connection: close

• Finding host IP address...
• Host IP address = 128.84.158.114
• Finding TCP protocol...
• Binding to local socket...
• Connecting to host...
• Sending request...
• Waiting for response...

Receiving Header:
HTTP/1.1·302·Found(CR)(LF)
Date:·Thu,·03·Apr·2008·23:37:58·GMT(CR)(LF)
Server:·Apache(CR)(LF)
Location:·http://arxiv.org/PS_cache/arxiv/pdf/0801/0801.0002v1.pdf(CR)(LF)
Connection:·close(CR)(LF)
Content-Type:·text/html;·charset=iso-8859-1(CR)(LF)
(CR)(LF)

End of Header (Length = 207)
• Elapsed time so far: 1 seconds
• Waiting for additional response until connection closes...

Total bytes received = 207
Elapsed time so far: 1 seconds

Done
Total elapsed time: 1 seconds


Linux wrote:Parameters:
URL = http://arxiv.org/pdf/0801.0002
UAG = Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9pre) Gecko/2008040304 Minefield/3.0pre ID:2008040304
AEN =
REQ = HEAD ; VER = 1.1 ; FMT = AUTO

Sending request:
HEAD /pdf/0801.0002 HTTP/1.1
Host: arxiv.org
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9pre) Gecko/2008040304 Minefield/3.0pre ID:2008040304
Connection: close

• Finding host IP address...
• Host IP address = 128.84.158.114
• Finding TCP protocol...
• Binding to local socket...
• Connecting to host...
• Sending request...
• Waiting for response...

Receiving Header:
HTTP/1.1·200·OK(CR)(LF)
Date:·Thu,·03·Apr·2008·23:38:58·GMT(CR)(LF)
Server:·Apache(CR)(LF)
Set-Cookie:·browser=rexswain.com.1207265938957387;·path=/;·max-age=946080000;·domain=.arxiv.org(CR)(LF)
Last-Modified:·Sun,·03·Feb·2008·04:17:25·GMT(CR)(LF)
ETag:·"1dd5810-f0627-4453948e0ff40"(CR)(LF)
Accept-Ranges:·bytes(CR)(LF)
Content-Length:·984615(CR)(LF)
Connection:·close(CR)(LF)
Content-Type:·application/pdf(CR)(LF)
(CR)(LF)

End of Header (Length = 348)
• Elapsed time so far: 0 seconds
• Waiting for additional response until connection closes...

Total bytes received = 348
Elapsed time so far: 0 seconds

Done
Total elapsed time: 0 seconds


So the server is sending different data for the different platforms. So my guess is that on Linux and Mac, http://arxiv.org/pdf/0801.0002 is the actual file, not a redirect. And it doesn't have an extension, which is the source of the problem. I believe it's the site's problem, not Firefox's.
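For anyone who wants to repeat the comparison without the web viewer, something like the following Python sketch should work. The `head` and `describe` helpers are my own, the site may no longer behave this way, and the sample values at the bottom are copied from the two captures above rather than fetched live.

```python
import http.client
from urllib.parse import urlparse

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head(url, user_agent):
    """Send one HEAD request without following redirects (needs network)."""
    parts = urlparse(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/",
                     headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, dict(resp.getheaders())
    finally:
        conn.close()


def describe(status, headers):
    """Summarise whether a response is a redirect or the file itself."""
    if status in REDIRECT_CODES:
        return "redirect to " + headers.get("Location", "?")
    return "direct " + headers.get("Content-Type", "unknown type")


# Values taken from the Windows and Linux captures above.
windows = (302, {
    "Location": "http://arxiv.org/PS_cache/arxiv/pdf/0801/0801.0002v1.pdf",
    "Content-Type": "text/html; charset=iso-8859-1",
})
linux = (200, {"Content-Type": "application/pdf"})

print(describe(*windows))
print(describe(*linux))
```

To query the server for real, call `describe(*head(url, user_agent))` once with each of the two User-Agent strings and compare the results.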

Though that doesn't mean Firefox is working perfectly. I also made up this little test case which is a PHP file that just sends a redirect http://beta.bluefang-logic.com/redir_dl/test.html
Right-Click->Save As on the link works properly, but doing so on the image doesn't.
kliu0x52
Posts: 569
Joined: October 18th, 2006, 2:23 pm
Location: .us

Post by kliu0x52 »

Bluefang wrote:So the server is sending different data for the different platforms. So my guess is that on Linux and Mac, http://arxiv.org/pdf/0801.0002 is the actual file, not a redirect. And it doesn't have an extension, which is the source of the problem. I believe it's the site's problem, not Firefox's.

LOL. Well, I guess that solves our mystery. I was scratching my head about why Firefox was doing different stuff for something that ought to have been platform-independent... it had never occurred to me to blame the website.

Bluefang wrote:Though that doesn't mean Firefox is working perfectly. I also made up this little test case which is a PHP file that just sends a redirect http://beta.bluefang-logic.com/redir_dl/test.html
Right-Click->Save As on the link works properly, but doing so on the image doesn't.

Hmm. Could you file a bug? Both the built-in saveURL and saveImageURL functions do not respect redirected filenames, which seems a bit silly if "Save Link As" does...
Bluefang
Posts: 7857
Joined: August 10th, 2005, 2:55 pm
Location: Vermont

Post by Bluefang »

Actually, one already exists...

https://bugzilla.mozilla.org/show_bug.cgi?id=311742

I commented in it earlier today.
danb77
Posts: 32
Joined: May 19th, 2004, 7:58 am

Post by danb77 »

The pdf file extension bug has now been reported on bugzilla.

https://bugzilla.mozilla.org/show_bug.cgi?id=290609

I will add a comment there referring back to this discussion.

It is interesting that the site is sending platform-specific headers. I will email them to let them know.
Bluefang
Posts: 7857
Joined: August 10th, 2005, 2:55 pm
Location: Vermont

Post by Bluefang »

While this is extremely strange, I don't think there is anything incorrect about it. I doubt it's uncommon to see sites providing platform-specific data. However, their method of doing so seems very poor.

As a side note, all of the links do work as expected. Save As has the correct file name that the server provided, and clicking the link does open the PDF in an embedded Adobe Reader.