Save web page as txt - no longer strips html/code (Linux)

Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Save web page as txt - no longer strips html/code

Post by Grumpus »

Didn't see anything in the repos about Concy but a Google search gave numerous results.
One from Urban Dictionary said it means to think about things deeply, and the pertinent one appeared to have something to do with a program to enhance Instagram.
Seems to me you're overcomplicating things and have gone off on several tangents trying to effect a change in something which may be the default at present without the use of an extension.
Doesn't matter what you say, it's wrong for a toaster to walk around the house and talk to you
Nettkrawler
Posts: 137
Joined: May 3rd, 2010, 3:31 pm

Re: Save web page as txt - no longer strips html/code

Post by Nettkrawler »

Grumpus - I'd be very thankful if you could point me to an extension that can do this conversion. I have tried to find one, but found nothing.

Well - I have tried some more things as well.

I did a check on the PATH variable for the users to see if something is missing.
There is just one item in User1's PATH that doesn't also exist in the PATH variable for User2 - and that is "/snap/bin", but that directory doesn't seem to exist. So I believe it is not an error in the PATH variable that causes this.
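
The PATH check described above can also be scripted. This is only a minimal sketch: the two PATH strings are made-up placeholders (the real values are not shown in this thread), and `path_differences` is a hypothetical helper name.

```python
import os

def path_differences(path_a, path_b):
    """Return the PATH entries that appear in only one of the two values."""
    entries_a = [p for p in path_a.split(":") if p]
    entries_b = [p for p in path_b.split(":") if p]
    return {
        "only_in_a": [p for p in entries_a if p not in entries_b],
        "only_in_b": [p for p in entries_b if p not in entries_a],
    }

# Placeholder values; paste the real output of `echo $PATH` for each user.
user1_path = "/usr/local/bin:/usr/bin:/bin:/snap/bin"
user2_path = "/usr/local/bin:/usr/bin:/bin"

diff = path_differences(user1_path, user2_path)
for entry in diff["only_in_a"]:
    # Also report whether the extra directory actually exists on disk.
    status = "exists" if os.path.isdir(entry) else "missing"
    print(f"only in User1: {entry} ({status})")
for entry in diff["only_in_b"]:
    status = "exists" if os.path.isdir(entry) else "missing"
    print(f"only in User2: {entry} ({status})")
```

An entry that is listed but missing on disk, like "/snap/bin" here, is normally harmless because the shell simply skips it.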

Another fact - this is the version of Firefox that ships with the installation of Linux Lite (same as for Ubuntu).
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Save web page as txt - no longer strips html/code

Post by Grumpus »

Snap is being used by some Linux distros, Ubuntu included, to install "works on any system" software.
However, there was a warning on The Register a couple of weeks ago about some form of exploit in Snap.
It may be a future feature at this point, or possibly disabled by the Linux Lite developers.

Frankly, I don't remember Firefox ever opening an html page as plain text.
I used to just copy the text to a simple text editor like Gedit or leafpad.
Nettkrawler
Posts: 137
Joined: May 3rd, 2010, 3:31 pm

Re: Save web page as txt - no longer strips html/code (Linux

Post by Nettkrawler »

Yes - I know I can copy and paste into any plain text editor.

But I always prefer to download directly - the nice thing is that the links are preserved in the text, which isn't the case if I copy and paste as plain text.

And most important of all - I like to fix things rather than going around the problem. This may not be fixable, but at least I won't give up before all reasonable ways of troubleshooting have been tried without positive results.

That said - I really wonder what happens if I remove Firefox (the Ubuntu edition) and install the main edition (if it may be called that), the one meant for any distro where Firefox isn't included from the start. I recall once seeing a description of how to do so.
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Save web page as txt - no longer strips html/code (Linux

Post by Grumpus »

I don't know how it will affect all your user profiles, other than that you may have to make new ones for the new install unless you can transfer all the information.
It will probably keep things like bookmarks, etc., since it's not really a different platform. Maybe one of these days Mozilla will make a .deb package for all.
When installing the Mozilla direct version, it is either installed locally and doesn't make all the third-party OS connections, or you have to go through a moderately complicated process.
You could search on Ubuntuzilla for an alternative; there used to be one which works like SeaMonkey did initially and places everything in the opt folder.
Or you might try Iceweasel, the Debian version, from SourceForge or the Debian repos.
There's also the possibility someone else has had the same issue, so you could check GitHub for an extension.
kerft
Posts: 585
Joined: January 30th, 2019, 9:38 am

Re: Save web page as txt - no longer strips html/code (Linux

Post by kerft »

I did a test. In Windows Firefox 65 save as text works as expected. In Fedora Firefox 64, distro version, save as text does as you say - saves an html file and a folder with the extras. I have not yet tried the Mozilla version on Linux. I think instead of installing it, you can extract it, and run it with -p to test with a temporary profile. If the Mozilla version does the same, check Bugzilla for existing bugs and file one if there is none.
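
A rough sketch of that test run, with one swap stated plainly: instead of picking a profile through -p, it points Firefox at a throwaway directory with -profile and adds -no-remote so that it doesn't attach to an already running instance. The binary path is a hypothetical location for the extracted Mozilla build; adjust it to wherever the archive was unpacked.

```python
import tempfile

def build_test_command(firefox_binary, profile_dir):
    """Build the command line for a run with a separate, empty profile."""
    # -no-remote: start a new instance instead of reusing a running one
    # -profile <dir>: use this directory as the profile
    return [firefox_binary, "-no-remote", "-profile", profile_dir]

# Throwaway profile directory; Firefox fills it on first start.
profile_dir = tempfile.mkdtemp(prefix="ff-test-profile-")

# Hypothetical path to the extracted Mozilla build.
command = build_test_command("/tmp/firefox/firefox", profile_dir)
print(" ".join(command))

# To actually launch it:
#   import subprocess
#   subprocess.run(command)
```

Since nothing is installed, removing the extracted folder and the temporary profile directory afterwards puts everything back as it was.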
smsmith
Moderator
Posts: 19979
Joined: December 7th, 2004, 8:51 pm
Location: Indiana

Re: Save web page as txt - no longer strips html/code (Linux

Post by smsmith »

kerft wrote:I think instead of installing it, you can extract it, and run it with -p to test with a temporary profile.
Yes, that's all you need to do to run the Mozilla Release version.
Give a man a fish, and he eats for a day. Teach a man to fish, and he eats for a lifetime.
I like poetry, long walks on the beach and poking dead things with a stick.
Please do not PM me for personal support. Keep posts here in the Forums instead and we all learn.
Grumpus
Posts: 13246
Joined: October 19th, 2007, 4:23 am
Location: ... Da' Swamp

Re: Save web page as txt - no longer strips html/code (Linux

Post by Grumpus »

In Mint and Ubuntu (3 systems tried), saving as text does not give text only; the html code shows in the saved file.
However, it seems to me, not being all that adept, that it is being saved as text only, or the code wouldn't show and you would see the graphics of the page instead.
I do get two different results. If I save the file as text and open it with gedit I get the plain text but with indentations and highlighted lines of the html.
If I remove the html extension and add .txt it shows everything in plain text without any highlighting.
therube
Posts: 21714
Joined: March 10th, 2004, 9:59 pm
Location: Maryland USA

Re: Save web page as txt - no longer strips html/code (Linux

Post by therube »

Grumpus wrote:If I save the file as text and open it with gedit
Seems that gedit has syntax highlighting.
https://upload.wikimedia.org/wikipedia/ ... .11.92.png
Guessing that on "knowing" the file is ".html" (.html extension), it automatically applies those rules.
And when you tell it the file is ".txt", it displays just that, "text" (without highlighting).
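
Since the extension only changes how the editor displays the file, the content itself is what needs checking. A minimal sketch, assuming a short list of common tags is a good enough signal; `looks_like_html` is a hypothetical helper and the two samples are made up for illustration:

```python
import re

# A few common tags; finding any of them suggests the markup was not stripped.
TAG_PATTERN = re.compile(
    r"<\s*(html|head|body|div|p|a|script|span)\b[^>]*>", re.IGNORECASE
)

def looks_like_html(text):
    """Return True if the text still contains HTML tags."""
    return TAG_PATTERN.search(text) is not None

html_sample = "<html><body><p>Hello</p></body></html>"
text_sample = "Hello\nJust a line of plain text"

print(looks_like_html(html_sample))
print(looks_like_html(text_sample))

# For a saved file (the path is a placeholder):
#   with open("/path/to/saved.txt", encoding="utf-8", errors="replace") as f:
#       print(looks_like_html(f.read()))
```

Renaming the file from .html to .txt does not change what this reports, because only the bytes in the file are examined.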
Fire 750, bring back 250.
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14 Pinball CopyURL+ FetchTextURL FlashGot NoScript
Nettkrawler
Posts: 137
Joined: May 3rd, 2010, 3:31 pm

using the strace tool in Linux

Post by Nettkrawler »

Update.

In another, local forum that I was bugging about the same issue, a guy suggested using the strace tool to trace the firefox process, to see whether it behaves differently for the two users when saving a web page as a local text file.

For both users - I did the following:
~ Opened Firefox through strace with "strace -o ff_trace_1.txt firefox" (ff_trace_2.txt for the second user). This generated two huge text files, ff_trace_1.txt and ff_trace_2.txt, taking up 23 and 26 MB of disk space.
~ Saved the first available web page after opening Firefox as a local text file, named tekstfil1.txt and tekstfil2.txt respectively (the latter containing html tags and accompanied by the sidecar folder).

Then I used grep to search the huge files ff_trace_1.txt and ff_trace_2.txt for the file names "tekstfil1.txt" and "tekstfil2.txt", using -A2 and -B2 to include 2 lines after and before each match, and -n to make grep also output line numbers.

fellesmappe simply means "common folder" (in the sense that several users have access) in Norwegian. It is the folder that both users have access to because they are members of the same group.


grep -A2 -B2 -n 'tekstfil1.txt' ff_trace_1.txt

Code:

297505-write(64, "\1\0\0\0\0\0\0\0", 8)        = 8
297506-futex(0x7f84255a4870, FUTEX_WAKE_PRIVATE, 1) = 1
297507:access("/lager/fellesmappe/tekstfil1.txt", F_OK) = -1 ENOENT (No such file or directory)
297508-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=755039528}) = 0
297509-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=755170271}) = 0
--
297563-mprotect(0x3d4da81df000, 4096, PROT_READ|PROT_WRITE) = 0
297564-mprotect(0x3d4da81df000, 4096, PROT_READ|PROT_EXEC) = 0
297565:stat("/lager/fellesmappe/tekstfil1.txt", 0x7f841eda2118) = -1 ENOENT (No such file or directory)
297566:lstat("/lager/fellesmappe/tekstfil1.txt", 0x7f841eda2118) = -1 ENOENT (No such file or directory)
297567-mprotect(0x3d4da81df000, 8192, PROT_READ|PROT_WRITE) = 0
297568-mprotect(0x3d4da81df000, 8192, PROT_READ|PROT_EXEC) = 0
--
297612-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=853116432}) = 0
297613-write(16, "\0", 1)                      = 1
297614:access("/lager/fellesmappe/tekstfil1.txt", F_OK) = -1 ENOENT (No such file or directory)
297615-mkdir("/lager/fellesmappe", 0755)   = -1 EEXIST (File exists)
297616:openat(AT_FDCWD, "/lager/fellesmappe/tekstfil1.txt", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 155
297617-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=854358774}) = 0
297618-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=854493010}) = 0
--
297653-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=875573220}) = 0
297654-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=875694115}) = 0
297655:stat("/lager/fellesmappe/tekstfil1.txt", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
297656-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=877083264}) = 0
297657-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1208, tv_nsec=877207862}) = 0


grep -A2 -B2 -n 'tekstfil2.txt' ff_trace_2.txt

Code:

241040-write(43, "\1\0\0\0\0\0\0\0", 8)        = 8
241041-futex(0x7f984585d190, FUTEX_WAKE_PRIVATE, 1) = 1
241042:access("/lager/fellesmappe/tekstfil2.txt", F_OK) = -1 ENOENT (No such file or directory)
241043-gettimeofday({tv_sec=1552749694, tv_usec=775398}, NULL) = 0
241044-futex(0x7f984582cf34, FUTEX_WAKE_PRIVATE, 1) = 1
--
241145-mprotect(0x2155d7db5000, 4096, PROT_READ|PROT_WRITE) = 0
241146-mprotect(0x2155d7db5000, 4096, PROT_READ|PROT_EXEC) = 0
241147:stat("/lager/fellesmappe/tekstfil2.txt", 0x7f98328b2598) = -1 ENOENT (No such file or directory)
241148:lstat("/lager/fellesmappe/tekstfil2.txt", 0x7f98328b2598) = -1 ENOENT (No such file or directory)
241149-mprotect(0x2155d7db5000, 8192, PROT_READ|PROT_WRITE) = 0
241150-mprotect(0x2155d7db5000, 8192, PROT_READ|PROT_EXEC) = 0
--
241236-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=17901072}) = 0
241237-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=18039638}) = 0
241238:stat("/lager/fellesmappe/tekstfil2.txt", 0x7f98328caa18) = -1 ENOENT (No such file or directory)
241239:lstat("/lager/fellesmappe/tekstfil2.txt", 0x7f98328caa18) = -1 ENOENT (No such file or directory)
241240-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=19668554}) = 0
241241-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=19817875}) = 0
--
252774-write(109, "\n(function() {\nvar f = document."..., 2644) = 2644
252775-close(109)                              = 0
252776:access("/lager/fellesmappe/tekstfil2.txt", F_OK) = -1 ENOENT (No such file or directory)
252777-mkdir("/lager/fellesmappe", 0755)   = -1 EEXIST (File exists)
252778:openat(AT_FDCWD, "/lager/fellesmappe/tekstfil2.txt", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 109
252779-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1502, tv_nsec=331486506}) = 0
252780-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1502, tv_nsec=331623466}) = 0


grep -A2 -B2 -n 'tekstfil2_files' ff_trace_2.txt
Searching for the name of the sidecar folder. I excluded everything after the fourth match, because all later matches seem to be repetitions of match #3 and match #4 where only the file name differs.

Code:

241169-mprotect(0x2155d7db7000, 4096, PROT_READ|PROT_WRITE) = 0
241170-mprotect(0x2155d7db7000, 4096, PROT_READ|PROT_EXEC) = 0
241171:stat("/lager/fellesmappe/tekstfil2_files", 0x7f98328ca358) = -1 ENOENT (No such file or directory)
241172:lstat("/lager/fellesmappe/tekstfil2_files", 0x7f98328ca358) = -1 ENOENT (No such file or directory)
241173-madvise(0x7f983e9e9000, 8192, MADV_DONTNEED) = 0
241174-madvise(0x7f983e9ba000, 8192, MADV_DONTNEED) = 0
--
241918-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=223816454}) = 0
241919-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=223929667}) = 0
241920:access("/lager/fellesmappe/tekstfil2_files", F_OK) = -1 ENOENT (No such file or directory)
241921:mkdir("/lager/fellesmappe/tekstfil2_files", 0755) = 0
241922-gettimeofday({tv_sec=1552749695, tv_usec=168621}, NULL) = 0
241923-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=224861843}) = 0
--
245727-mprotect(0x2155d7d5f000, 4096, PROT_READ|PROT_EXEC) = 0
245728-gettimeofday({tv_sec=1552749695, tv_usec=837886}, NULL) = 0
245729:stat("/lager/fellesmappe/tekstfil2_files/google_powered_by.png", 0x7f983798f898) = -1 ENOENT (No such file or directory)
245730:lstat("/lager/fellesmappe/tekstfil2_files/google_powered_by.png", 0x7f983798f898) = -1 ENOENT (No such file or directory)
245731-gettimeofday({tv_sec=1552749695, tv_usec=838435}, NULL) = 0
245732-gettimeofday({tv_sec=1552749695, tv_usec=838701}, NULL) = 0
--
245783-mprotect(0x2155d7d61000, 8192, PROT_READ|PROT_EXEC) = 0
245784-gettimeofday({tv_sec=1552749695, tv_usec=861628}, NULL) = 0
245785:mkdir("/lager/fellesmappe/tekstfil2_files", 0755) = -1 EEXIST (File exists)
245786:openat(AT_FDCWD, "/lager/fellesmappe/tekstfil2_files/google_powered_by.png", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 160
245787-clock_gettime(CLOCK_MONOTONIC, {tv_sec=1500, tv_nsec=918304478}) = 0
245788-gettimeofday({tv_sec=1552749695, tv_usec=862302}, NULL) = 0
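
The grep output above can be boiled down to "which files and folders did Firefox actually create". The following is only a sketch under some assumptions: it expects raw lines from the strace output file (no grep line-number prefix, no PID prefix from -f), it skips any line whose return value is not a plain integer, and `parse_line` and `created_paths` are hypothetical helper names. The sample lines are copied from the second trace.

```python
import re

# Matches e.g.:  mkdir("/some/dir", 0755) = -1 EEXIST (File exists)
SYSCALL_LINE = re.compile(
    r'^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)(?:\s+(?P<errno>[A-Z0-9]+))?'
)
# First double-quoted argument, allowing backslash escapes inside it.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')

def parse_line(line):
    """Split one strace line into name, args, first quoted arg, return value, errno."""
    match = SYSCALL_LINE.match(line.strip())
    if match is None:
        return None
    quoted = QUOTED.search(match.group("args"))
    return {
        "name": match.group("name"),
        "args": match.group("args"),
        "path": quoted.group(1) if quoted else None,
        "ret": int(match.group("ret")),
        "errno": match.group("errno"),
    }

def created_paths(lines):
    """List paths successfully created via mkdir or open/openat with O_CREAT."""
    created = []
    for line in lines:
        call = parse_line(line)
        if call is None or call["ret"] < 0 or call["path"] is None:
            continue
        is_mkdir = call["name"] == "mkdir"
        is_create_open = call["name"] in ("open", "openat") and "O_CREAT" in call["args"]
        if (is_mkdir or is_create_open) and call["path"] not in created:
            created.append(call["path"])
    return created

sample = [
    'access("/lager/fellesmappe/tekstfil2.txt", F_OK) = -1 ENOENT (No such file or directory)',
    'mkdir("/lager/fellesmappe/tekstfil2_files", 0755) = 0',
    'mkdir("/lager/fellesmappe", 0755)   = -1 EEXIST (File exists)',
    'openat(AT_FDCWD, "/lager/fellesmappe/tekstfil2.txt", O_WRONLY|O_CREAT|O_TRUNC, 0664) = 109',
]

for path in created_paths(sample):
    print(path)
```

Running `created_paths` over both trace files and comparing the two lists would show the difference directly: for the second user the sidecar folder and its files appear next to the .txt file, for the first user only the .txt file is expected.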
therube
Posts: 21714
Joined: March 10th, 2004, 9:59 pm
Location: Maryland USA

Re: Save web page as txt - no longer strips html/code (Linux

Post by therube »

Can you zip up your Profile (removing any private data first, if needed, like logins.json) & post it some place?
Nettkrawler
Posts: 137
Joined: May 3rd, 2010, 3:31 pm

Re: Save web page as txt - no longer strips html/code (Linux

Post by Nettkrawler »

therube wrote:Can you zip up your Profile (removing any private data first, if needed, like logins.json) & post it some place?
The issue seems to affect any newly created user profile (OS), as well as any existing user profile (OS) where all content in the Firefox user profile folder has been deleted. That is - even when starting off with zero files inside, the problem still persists.

That goes for every user on the computer except user1 - for some strange reason, that user profile is not affected by the issue :?
dickvl
Posts: 54161
Joined: July 18th, 2005, 3:25 am

Re: Save web page as txt - no longer strips html/code (Linux

Post by dickvl »

Did you try the Firefox version from the Mozilla server?
https://www.mozilla.org/en-US/firefox/all/
Nettkrawler
Posts: 137
Joined: May 3rd, 2010, 3:31 pm

Re: Save web page as txt - no longer strips html/code (Linux

Post by Nettkrawler »

Update.

Since I got a new laptop and installed Fedora 30 desktop (FF 80.0 at this point), I haven't had this issue, even though the user account setup is pretty much the same. I use the Cinnamon desktop.
DanRaisch
Moderator
Posts: 127224
Joined: September 23rd, 2004, 8:57 pm
Location: Somewhere on the right coast

Re: Save web page as txt - no longer strips html/code (Linux

Post by DanRaisch »

Thanks for the update.

Locking this thread due to the age of the original posts.