MozillaZine

Change webpages fallback encoding to Unicode?

User Help for Seamonkey and Mozilla Suite
barbaz
 
Posts: 1677
Joined: October 1st, 2014, 3:25 pm

Post Posted February 19th, 2017, 7:28 pm

This page, https://xkcd.com/1705/, and others on the same site don't declare a character encoding. SeaMonkey assumes Western, so the comic title at that URL comes out garbled.

If I manually choose View > Text Encoding > Unicode, it displays correctly.

How do I make SeaMonkey assume Unicode when a page doesn't declare an encoding?
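The garbling described above can be reproduced outside the browser. This is only a sketch: it assumes an `iconv` that recognises the name WINDOWS-1252 and a UTF-8 terminal, and `misread_as_western` is a made-up helper name. It takes the UTF-8 bytes of "Pokémon" (from the comic title) and decodes them the way a Western fallback would.

```shell
#!/bin/sh
# misread_as_western is a hypothetical helper for this sketch.
# It treats stdin as windows-1252 and re-encodes the result as UTF-8,
# so a UTF-8 terminal shows roughly what the browser would render.
misread_as_western() {
    iconv -f WINDOWS-1252 -t UTF-8
}

# 'Pok\303\251mon' is "Pokémon" written as raw UTF-8 bytes (octal escapes):
# the single character é is the two bytes 0xC3 0xA9.
printf 'Pok\303\251mon\n' | misread_as_western
```

Each of the two bytes of é is shown as its own Western character, so the word prints as "PokÃ©mon".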
*Always* check the changelogs BEFORE updating that important software!

Grumpus

Posts: 11592
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Post Posted February 20th, 2017, 11:13 am

Go to /Edit/Preferences/Browser/Languages.
Bottom of window is: Text Encoding for Legacy Content / Fallback Text Encoding. Try "Other (Including Western European)".
That makes this page show as Unicode, and Unicode is checked under View.
It might also help to set the same for the Mail side of things, but that's a guess; both are the same with 2.46 on this system.
It may also have something to do with the site. I seem to remember one of the Linux distributions allowed the system setting to be Unicode, which seemed to take precedence.

barbaz
 

Post Posted February 20th, 2017, 11:31 am

Grumpus wrote:Go to /Edit/Preferences/Browser/Languages.
Bottom of window is : Text encoding for Legacy content /Fallback Text Encoding,

Thanks, I went there and found that it changes the preference "intl.charset.fallback.override". But setting it to 'utf-8' or 'UTF-8' made no difference. :(

Grumpus wrote:Makes this page show as Unicode and Unicode is checked under View.

This page does declare its character encoding as Unicode.
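One rough way to check whether a page declares an encoding in its HTTP headers is sketched below. It only looks at the Content-Type header; a page can also declare its encoding with a meta tag or a byte order mark, which this does not inspect. `declared_charset` is a hypothetical helper name.

```shell
#!/bin/sh
# declared_charset is a hypothetical helper for this sketch.
# It reads HTTP response headers on stdin and prints the charset parameter
# of the Content-Type header in lowercase, or "none" if there is none.
declared_charset() {
    tr -d '\r' | tr '[:upper:]' '[:lower:]' |
    sed -n 's/^content-type:.*charset=\([^;[:space:]]*\).*$/\1/p' |
    head -n 1 | grep . || echo none
}

# Example with canned headers, so nothing is fetched from the network:
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n' | declared_charset
```

Against a live page, the headers could be piped in with something like `curl -sI https://xkcd.com/1705/ | declared_charset`; what that prints depends on how the site is configured at the time.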

Grumpus


Post Posted February 20th, 2017, 11:42 am

These are the settings for Unicode in about:config for SeaMonkey 2.46:
network.standard-url.encode-utf8; true
network.standard-url.escape-utf8; true
prefs.converted-to-utf8; false. Maybe setting it to true would change it.

barbaz
 

Post Posted February 20th, 2017, 12:17 pm

Still not working.

Grumpus


Post Posted February 21st, 2017, 6:14 am

There were four items in SM 2.46 about:config which had the fallback at Western (windows-1252); you changed one.
Note the mailnews default items are set to send in windows-1252 by default.
It doesn't make a difference in either SeaMonkey or Firefox: the page goes Western in the View menu.
intl.charset.fallback.override
intl.fallbackCharsetList.ISO-8859-1
mailnews.send_default_charset
mailnews.view_default_charset

If you go into about:config and search for "charset", there does not appear to be any UTF-8 character encoding item.
There is a fallback item and a filter for UTF-7, but I could not find any UTF-8-related encoding items among the defaults.
I'm wondering whether the defaults need to be changed from windows-1252, and whether they can be.

Also, intl.charset.detector is a string value and is empty by default.
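The same "charset" search can be approximated from a shell against a profile's prefs.js. This is a sketch only: `list_charset_prefs` is a made-up helper and the profile path is a placeholder. Keep in mind that prefs.js only records prefs that have been changed from their defaults, so untouched items such as an empty intl.charset.detector would not show up.

```shell
#!/bin/sh
# list_charset_prefs is a hypothetical helper for this sketch.
# It prints every user_pref line in the given file whose pref name
# contains "charset".
list_charset_prefs() {
    grep -E '^user_pref\("[^"]*charset[^"]*"' "$1"
}

# Example call (the path is a placeholder, not a real location):
# list_charset_prefs "/path/to/seamonkey-profile/prefs.js"
```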

barbaz
 

Post Posted February 21st, 2017, 9:34 am

Hmm. I found this hack for Firefox - https://bugzilla.mozilla.org/show_bug.cgi?id=1071816#c16
But I don't really understand what it's doing, and parts of it deal with Firefox-specific code.

Since I build SeaMonkey myself, I can, in theory, "just try it". But how do I translate it for SeaMonkey?

Anonymosity
 
Posts: 8452
Joined: May 7th, 2007, 12:07 pm

Post Posted February 21st, 2017, 10:04 am

I tried changing "intl.charset.fallback.override" and "intl.fallbackCharsetList.ISO-8859-1" both from "windows-1252" to "ISO-8859-1", but that made no difference. Why should Windows font codes be the default anyway? I am using MacOS, and the upper half of the font codes are completely different from Windows font codes.
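For what it's worth, ISO-8859-1 and windows-1252 differ only in the 0x80-0x9F byte range, and the WHATWG Encoding Standard treats the label ISO-8859-1 as another name for windows-1252. Switching between the two would therefore not be expected to change how a UTF-8 title gets misread. A small sketch, assuming an `iconv` that knows both names; `decode_as` is a made-up helper:

```shell
#!/bin/sh
# decode_as is a hypothetical helper for this sketch.
# It decodes stdin from the encoding named in $1 and prints it as UTF-8.
decode_as() {
    iconv -f "$1" -t UTF-8
}

# The UTF-8 bytes of "Pokémon" (é = 0xC3 0xA9), misread under each
# single-byte encoding. Both bytes lie above 0x9F, where the two agree.
printf 'Pok\303\251mon\n' | decode_as ISO-8859-1
printf 'Pok\303\251mon\n' | decode_as WINDOWS-1252
```

Both lines print the same garbled "PokÃ©mon".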

Grumpus


Post Posted February 21st, 2017, 11:19 am

@barbaz - if I understand the code, #1 removes a blocker in the code for UTF-8
. . . and #2 places it back in the listing under Fonts & Colors/Advanced.
You would have to get into the area or module used to control the settings (guessing) to edit in the changes.
There is also a follow-up change in the third and a change in about:config.
It's beyond me where or what file to edit, and I wonder how long it would hold up after more updates.
It would seem that to be able to read many sites there's an automatic changeover on a per-site basis, and it may be necessary for cross-platform use.

Edit: due to more thinking. :-k

Grumpus


Post Posted February 22nd, 2017, 1:24 pm

OK, I think I somewhat solved this. I kept notes and tried the linked page.
It may be only a partial solution, but it seemed to allow a reload to Unicode.
That makes sense, as most of my system has always been set to Unicode for some of the different character displays.
It opened in Windows but allowed a change in View/Text Encoding to Unicode, and a reload of the page held at Unicode.
In about:config I changed the following items. Make sure you keep notes.
intl.charset.fallback.override; UTF-8 (uppercase, as displayed)
intl.charset.fallback.tld; true
intl.fallbackCharsetList.ISO-8859-1; leave blank
mailnews.force_charset_override; false
mailnews.reply_in_default_charset; true
mailnews.send_default_charset; UTF-8 (was windows-1252; right-clicking and hitting Reset took it to UTF-8)
mailnews.view_default_charset; UTF-8 (was also windows-1252, but a reset took it to ISO-8859-1)
prefs.converted-to-utf8; true (was set to false)

This is for SeaMonkey.

barbaz
 

Post Posted February 22nd, 2017, 2:34 pm

Grumpus wrote:@barbaz - if I understand the code it's removing a blocker in the code for Utf-8 in #1
. . . and placing it back in the listing under Fonts & Colors/advanced in #2
[...]
There is also a follow up change in the third and a change in about:config.

Thanks for the explanation! I think I know now how to apply that fix to SeaMonkey. I'll do their step 1 and then add a Unicode option to comm-(whatever)'s suite/common/pref/pref-languages.xul

Grumpus wrote:Opened in Windows but allowed a change in the View/Text Encoding to Unicode and a reload of the page held at Unicode.

That's the behavior I get without modifying any about:config prefs. I'm hoping to be able to skip the "View > Text Encoding > Unicode" part.

Grumpus


Post Posted February 23rd, 2017, 6:05 am

If you get the same behavior without the changes to about:config, that makes me think the changes were a waste, but I'll leave them alone for a while and see how things go.
I've never been worried by the "special character" stand-ins on MS-built sites; sort of a badge of honor. ;)

Please outline what files and editor you use to make the code changes if it's successful?

Anonymosity
 

Post Posted February 23rd, 2017, 2:50 pm

Grumpus wrote:OK I think I somewhat solved this. Kept notes and tried the linked page.
It may be only a partial solution but it seemed to allow a reload to unicode.
Makes sense as most of my system has always been set to unicode for some of the different character displays.
Opened in Windows but allowed a change in the View/Text Encoding to Unicode and a reload of the page held at Unicode.
In about:config changed the following items. Make sure you keep notes.
intl.charset.fallback.override set to UTF-8 (uppercase caps as displayed)
intl.charset.fallback.tld; true
intl.fallbackCharsetList.ISO-8859-1; leave blank
mailnews.force_charset_override;false
mailnews.reply_in_default_charset;true
mailnews.send_default_charset; UTF-8 (When right clicking and hitting reset it went to UTF-8 was windows-1252)
mailnews.view_default_charset; UTF-8 (this also at windows-1252 but a reset took it to ISO-8859-1)
prefs.converted-to-utf8;true (was set to false)

This is for SeaMonkey

What am I missing? I did all those settings, but the Pokémon title still looks like this: "Pokémon".

Grumpus


Post Posted February 24th, 2017, 5:28 am

Open View and look at the Character Encoding; it's probably on Western. Hit Unicode.
Just tried it and it changed immediately with SeaMonkey and Firefox.
I'm thinking the automatic switch to UTF-8 may only occur if barbaz gets the code to work.

ps: just noticed the title changed in History as well.

barbaz
 

Post Posted March 2nd, 2017, 9:21 pm

Sorry it took so long for me to get back to this. I accidentally deleted my build VM (not for the first time) #-o

Anyway, the patches work! Thanks for your help Grumpus! :D

Grumpus wrote:Please outline what files and editor you use to make the code changes if it's successful?

Assuming you've already checked out the source with Mercurial:

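1) Run the first command from that bug, and commit the change in your mozilla repository -
Code:
cd comm-*/mozilla
# First expression: replace everything from "(mFallback)" to end of line with "(mFallback)) {"
# Second expression: delete every line that contains "UTF-8"
# Note: "sed -i" with no suffix is GNU sed syntax; BSD/macOS sed needs -i ''
sed -i 's|(mFallback).*$|(mFallback)) {|;/UTF-8/d' dom/encoding/FallbackEncoding.cpp
hg ci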


2) Apply this patch to your comm-* repository -
Code:
# HG changeset patch
# User barbaz
# Date 1488482902 0
#      Thu Mar 02 19:28:22 2017 +0000
# Parent  101aeb596a394a0bba86ba545647c9329269ec94
enable UTF-8 fallback - GUI part

diff --git a/suite/common/pref/pref-languages.xul b/suite/common/pref/pref-languages.xul
--- a/suite/common/pref/pref-languages.xul
+++ b/suite/common/pref/pref-languages.xul
@@ -114,6 +114,7 @@
             <menuitem label="&FallbackCharset.thai;"        value="windows-874"/>
             <menuitem label="&FallbackCharset.turkish;"     value="windows-1254"/>
             <menuitem label="&FallbackCharset.vietnamese;"  value="windows-1258"/>
+            <menuitem label="UTF-8"                         value="UTF-8"/>
             <menuitem label="&FallbackCharset.other;"       value="windows-1252"/>
           </menupopup>
         </menulist>


3) Build SeaMonkey (instructions for this are available elsewhere)

4) From the new build, follow Grumpus' instructions here - viewtopic.php?p=14733916#p14733916
But unlike with the stock build, you will have a 'UTF-8' option in the dropdown. Select that.

5) Clear all browsing history data, then reload any affected pages.
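
To see what the sed expressions in step 1 do without touching the real source, here is a sketch that runs them on a made-up snippet. The input is hypothetical and only modelled on the shape the expressions look for; it is not copied from dom/encoding/FallbackEncoding.cpp, and `apply_fallback_edit` is a name invented for this example.

```shell
#!/bin/sh
# apply_fallback_edit is a hypothetical wrapper around the two sed
# expressions from step 1. It filters stdin instead of editing a file.
apply_fallback_edit() {
    sed 's|(mFallback).*$|(mFallback)) {|;/UTF-8/d'
}

# Hypothetical input: a three-part condition whose last part rejects UTF-8.
apply_fallback_edit <<'EOF'
if (!FindEncoding(override, mFallback) ||
    !IsAsciiCompatible(mFallback) ||
    mFallback.EqualsLiteral("UTF-8")) {
  mFallback.Truncate();
}
EOF
```

On that input the second line becomes `    !IsAsciiCompatible(mFallback)) {` and the line mentioning "UTF-8" disappears, so the condition closes one test earlier and no longer rejects UTF-8. The first line is left alone because "(mFallback)" has to appear with the opening parenthesis directly before the name.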
