Change webpages fallback encoding to Unicode?

User Help for Seamonkey and Mozilla Suite
barbaz
Posts: 1504
Joined: October 1st, 2014, 3:25 pm

Change webpages fallback encoding to Unicode?

Post by barbaz »

This page, https://xkcd.com/1705/, and others on the same site don't declare a character encoding. SeaMonkey assumes Western, so the comic title at that URL comes out garbled.

If I manually go View > Text Encoding > Unicode, it displays correctly.

How can I make SeaMonkey assume Unicode when a page doesn't declare an encoding?
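
For anyone wondering what the garbling looks like, here is a rough sketch in Python. It assumes the page is served as UTF-8 bytes and that "Western" in the View menu corresponds to windows-1252 (called cp1252 in Python). The sample word is only an illustration and is not fetched from the site; the helper name is made up for this example.

```python
# Illustration only: how UTF-8 text looks when decoded with a Western fallback.
# Assumption: "Western" corresponds to windows-1252 (cp1252 in Python).

def decode_with_fallback(raw: bytes, fallback: str) -> str:
    """Decode bytes using the given fallback, as if no encoding was declared."""
    return raw.decode(fallback)

title = "Pokémon"                  # sample word with one non-ASCII letter
raw = title.encode("utf-8")        # the bytes a UTF-8 page would send

garbled = decode_with_fallback(raw, "cp1252")  # Western fallback
correct = decode_with_fallback(raw, "utf-8")   # Unicode fallback

print(garbled)
print(correct)
```

The two-byte UTF-8 sequence for "é" is read as two separate Western characters, which is why only the accented letter is affected and plain ASCII text looks fine.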
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Change webpages fallback encoding to Unicode?

Post by Grumpus »

Go to /Edit/Preferences/Browser/Languages.
At the bottom of the window is "Text Encoding for Legacy Content" / "Fallback Text Encoding". Try "Other (including Western European)".
That makes this page show as Unicode, and Unicode is checked under View.
It might also help to set the same on the Mail side of things, but that's a guess; both are the same with 2.46 on this system.
It may also have something to do with the site. I seem to remember one of the Linux distributions allowed the system setting to be Unicode, which seemed to take precedence.
Doesn't matter what you say, it's wrong for a toaster to walk around the house and talk to you
barbaz
Posts: 1504
Joined: October 1st, 2014, 3:25 pm

Re: Change webpages fallback encoding to Unicode?

Post by barbaz »

Grumpus wrote: Go to /Edit/Preferences/Browser/Languages.
At the bottom of the window is "Text Encoding for Legacy Content" / "Fallback Text Encoding".
Thanks, I went there and found that it changes the preference "intl.charset.fallback.override". But setting it to 'utf-8' or 'UTF-8' made no difference. :(
Grumpus wrote: That makes this page show as Unicode, and Unicode is checked under View.
This forum page does declare its character encoding as Unicode, so the fallback isn't involved here.
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Change webpages fallback encoding to Unicode?

Post by Grumpus »

These are the Unicode-related settings in about:config for SeaMonkey 2.46:
network.standard-url.encode-utf8; true
network.standard-url.escape-utf8; true
prefs.converted-to-utf8; false (maybe setting it to true would change things)
barbaz
Posts: 1504
Joined: October 1st, 2014, 3:25 pm

Re: Change webpages fallback encoding to Unicode?

Post by barbaz »

Still not working.
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Change webpages fallback encoding to Unicode?

Post by Grumpus »

There are four items in SM 2.46 about:config that have the fallback at Western (windows-1252); you changed one.
Note that the mailnews default items are set to send in windows-1252 by default.
It doesn't make a difference in either SeaMonkey or Firefox; the page still goes to Western in the View menu.
intl.charset.fallback.override
intl.fallbackCharsetList.ISO-8859-1
mailnews.send_default_charset
mailnews.view_default_charset

If you go into about:config and search for "charset", there does not appear to be any UTF-8 character encoding entry.
There is a fallback item and a filter for UTF-7, but I could not find any default items related to UTF-8 encoding.
I'm wondering whether the defaults need to be changed from windows-1252, and whether they can be.

Also, intl.charset.detector is a string value and is empty by default.
barbaz
Posts: 1504
Joined: October 1st, 2014, 3:25 pm

Re: Change webpages fallback encoding to Unicode?

Post by barbaz »

Hmm. I found this hack for Firefox - https://bugzilla.mozilla.org/show_bug.c ... 071816#c16
But I don't really understand what it's doing, and parts of it deal with Firefox-specific code.

Since I build SeaMonkey myself, I can, in theory, "just try it". But how do I translate it for SeaMonkey?
Anonymosity
Posts: 8779
Joined: May 7th, 2007, 12:07 pm

Re: Change webpages fallback encoding to Unicode?

Post by Anonymosity »

I tried changing "intl.charset.fallback.override" and "intl.fallbackCharsetList.ISO-8859-1" both from "windows-1252" to "ISO-8859-1", but that made no difference. Why should Windows font codes be the default anyway? I am using MacOS, and the upper half of the font codes are completely different from Windows font codes.
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Change webpages fallback encoding to Unicode?

Post by Grumpus »

@barbaz - if I understand the code, #1 removes a blocker for UTF-8 in the code
. . . and #2 places UTF-8 back in the listing under Fonts & Colors/Advanced.
You would have to get into the area or module that controls the settings (guessing) to edit in the changes.
There is also a follow-up change in the third step, and a change in about:config.
Where or what file to edit is beyond me, and I wonder how long it would hold up after more updates.
It would seem that, to be able to read many sites, there's an automatic changeover on a per-site basis, and it may be necessary for cross-platform use.

Edit: due to more thinking. :-k
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Change webpages fallback encoding to Unicode?

Post by Grumpus »

OK, I think I somewhat solved this. Kept notes and tried the linked page.
It may be only a partial solution, but it seemed to allow a reload to Unicode.
Makes sense, as most of my system has always been set to Unicode for some of the different character displays.
The page opened in Western (windows-1252), but I was able to change View/Text Encoding to Unicode, and a reload of the page held at Unicode.
In about:config I changed the following items. Make sure you keep notes.
intl.charset.fallback.override; UTF-8 (uppercase, as displayed)
intl.charset.fallback.tld; true
intl.fallbackCharsetList.ISO-8859-1; leave blank
mailnews.force_charset_override; false
mailnews.reply_in_default_charset; true
mailnews.send_default_charset; UTF-8 (was windows-1252; right-clicking and hitting Reset took it to UTF-8)
mailnews.view_default_charset; UTF-8 (also was windows-1252, but a reset took it to ISO-8859-1)
prefs.converted-to-utf8; true (was set to false)

This is for SeaMonkey.
barbaz
Posts: 1504
Joined: October 1st, 2014, 3:25 pm

Re: Change webpages fallback encoding to Unicode?

Post by barbaz »

Grumpus wrote: @barbaz - if I understand the code, #1 removes a blocker for UTF-8 in the code
. . . and #2 places UTF-8 back in the listing under Fonts & Colors/Advanced.
[...]
There is also a follow-up change in the third step, and a change in about:config.
Thanks for the explanation! I think I know now how to apply that fix to SeaMonkey. I'll do their step 1 and then add a Unicode option to comm-(whatever)'s suite/common/pref/pref-languages.xul.
Grumpus wrote: The page opened in Western (windows-1252), but I was able to change View/Text Encoding to Unicode, and a reload of the page held at Unicode.
That's the behavior I get without modifying any about:config prefs. I'm hoping to be able to skip the "View > Text Encoding > Unicode" part.
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Change webpages fallback encoding to Unicode?

Post by Grumpus »

If you get the same behavior without the changes to about:config, that makes me think the changes were a waste, but I'll leave them alone for a while and see how things go.
I've never been worried by the "special character" stand-ins on MS-built sites; sort of a badge of honor. ;)

Please outline what files and editor you use to make the code changes if it's successful?
Anonymosity
Posts: 8779
Joined: May 7th, 2007, 12:07 pm

Re: Change webpages fallback encoding to Unicode?

Post by Anonymosity »

Grumpus wrote:OK I think I somewhat solved this. Kept notes and tried the linked page.
[...]
What am I missing? I did all those settings, but the Pokémon title still looks like this: "PokÃ©mon".
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Change webpages fallback encoding to Unicode?

Post by Grumpus »

Open View and look at the Character Encoding; it's probably on Western. Hit Unicode.
Just tried it, and it changed immediately in SeaMonkey and Firefox.
I'm thinking the automatic switch to UTF-8 may only occur if barbaz gets the code to work.

PS: just noticed the title changed in History as well.
barbaz
Posts: 1504
Joined: October 1st, 2014, 3:25 pm

Re: Change webpages fallback encoding to Unicode?

Post by barbaz »

Sorry it took so long for me to get back to this. I accidentally deleted my build VM (not for the first time) #-o

Anyway, the patches work! Thanks for your help Grumpus! :D
Grumpus wrote:Please outline what files and editor you use to make the code changes if it's successful?
Assuming you've already checked out the source with Mercurial:

1) Run the first command from that bug, and commit the change in your mozilla repository -

cd comm-*/mozilla
sed -i 's|(mFallback).*$|(mFallback)) {|;/UTF-8/d' dom/encoding/FallbackEncoding.cpp
hg ci

2) Apply this patch to your comm-* repository -

# HG changeset patch
# User barbaz
# Date 1488482902 0
#      Thu Mar 02 19:28:22 2017 +0000
# Parent  101aeb596a394a0bba86ba545647c9329269ec94
enable UTF-8 fallback - GUI part

diff --git a/suite/common/pref/pref-languages.xul b/suite/common/pref/pref-languages.xul
--- a/suite/common/pref/pref-languages.xul
+++ b/suite/common/pref/pref-languages.xul
@@ -114,6 +114,7 @@
             <menuitem label="&FallbackCharset.thai;"        value="windows-874"/>
             <menuitem label="&FallbackCharset.turkish;"     value="windows-1254"/>
             <menuitem label="&FallbackCharset.vietnamese;"  value="windows-1258"/>
+            <menuitem label="UTF-8"                         value="UTF-8"/>
             <menuitem label="&FallbackCharset.other;"       value="windows-1252"/>
           </menupopup>
         </menulist>

3) Build SeaMonkey (instructions for this are available elsewhere)

4) From the new build, follow Grumpus' instructions here - http://forums.mozillazine.org/viewtopic ... #p14733916
But unlike with the stock build, you will have a 'UTF-8' option in the dropdown. Select that.

5) Clear all browsing history data, then reload any affected pages.
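
For anyone unsure what the sed one-liner in step 1 does: it runs two edits on every line of the file. First, whatever follows "(mFallback)" on a line is replaced so that the line ends in "(mFallback)) {". Then any line that still contains "UTF-8" is deleted. Below is a rough Python equivalent. The sample input is a made-up excerpt for illustration only, not copied from FallbackEncoding.cpp, and the names apply_step1 and IsSupported are hypothetical.

```python
import re

def apply_step1(lines):
    """Mimic: sed 's|(mFallback).*$|(mFallback)) {|;/UTF-8/d'"""
    out = []
    for line in lines:
        # s|(mFallback).*$|(mFallback)) {|
        # Parentheses are literal in sed's basic regex, so they are escaped here.
        line = re.sub(r"\(mFallback\).*$", "(mFallback)) {", line, count=1)
        # /UTF-8/d : drop the line if it mentions UTF-8
        if "UTF-8" in line:
            continue
        out.append(line)
    return out

# Hypothetical excerpt, shaped like a condition that rejects UTF-8 as a fallback.
sample = [
    "if (!IsSupported(mFallback) ||",
    '    mFallback.EqualsLiteral("UTF-8")) {',
    "  mFallback.Truncate();",
    "}",
]

result = apply_step1(sample)
for line in result:
    print(line)
```

On input shaped like this, the UTF-8 exclusion drops out of the condition while the rest stays intact. Note that /UTF-8/d removes every line in the file that mentions UTF-8, comments included, so it is worth reviewing the diff before running hg ci.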