On "Page Info > Links", saving Name and Address

berwin
Posts: 3
Joined: April 5th, 2006, 11:02 pm

Post by berwin »

When right-clicking on a page > View Page Info > Links, a window shows up with a description on the left side and its link on the right side. When there are many links on the page, many descriptions and their links are listed.
My question is: Is it possible to save both descriptions and their links?
So far, using "Select All > Copy" and pasting into Notepad or Excel, I have only been able to get the links, never the descriptions of the links.
the-edmeister
Posts: 32249
Joined: February 25th, 2003, 12:51 am
Location: Chicago, IL, USA

Post by the-edmeister »

Maximize the window and take a Screenshot.


Ed
A mind is a terrible thing to waste. Mine has wandered off and I'm out looking for it.
berwin
Posts: 3
Joined: April 5th, 2006, 11:02 pm

Post by berwin »

Thanks, I know it would work, but on a page with hundreds of links the process would become cumbersome, and then converting the graphics back to text, oh boy...
I guess there is not a simple method...
jscher2000
Posts: 11762
Joined: December 19th, 2004, 12:26 am
Location: Silicon Valley, CA USA

Post by jscher2000 »

Didn't Netscape 4.x have the option to print pages with a list of all the links at the end? That was handy.

With a little bit of DOM programming, it definitely should be possible to extract all the links, leaving just the question of how to format the results. When I do applications like this, I tend to use the internet controls supplied with IE, programmed from a VBA host like Microsoft Word (just 'cause I'm so familiar with it). But JavaScript should work, too. I'm not familiar with all the respective powers of bookmarklets, Greasemonkey scripts, and extensions, but at least one of them should have the privileges necessary to do it.
texmex
Posts: 403
Joined: March 26th, 2006, 8:58 am
Location: NE Lincs

Post by texmex »

Well here's a quick and dirty solution:
Copy the contents of this code to your clipboard
Go into Bookmark manager and create a new bookmark
Give it any name you like
Paste my code into the "Location" box.
and close.

Now if you click that bookmark, it will open a new window and write the links and their text strings into it as a table. You can then copy and paste the table.

javascript:x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<HTML><HEAD></HEAD><BODY><table>%22);for(n=0;n<x.length;n++){y.document.write(%22<tr><td>%22+x[n].text+%22</td><td>%22+x[n].href+%22</td></tr>%22);}y.document.write(%22</table></BODY></HTML>%22);y.document.close();void 0;


Sorry it's so wide, but it's imperative that you have no spaces in this line of code.

It only checks for anchor tags (not other types of links).
Hope this helps.
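
For anyone who finds the one-liner hard to follow, here is a rough, readable sketch of the same logic. The function names (collectLinks, toTableHtml) are made up for illustration and are not part of the bookmarklet; the browser part at the bottom assumes pop-ups are allowed.

```javascript
// Readable sketch of the bookmarklet's logic; function names are illustrative.

// Turn an array-like of anchor elements into plain {text, href} records.
function collectLinks(anchors) {
  var rows = [];
  for (var n = 0; n < anchors.length; n++) {
    rows.push({ text: anchors[n].text, href: anchors[n].href });
  }
  return rows;
}

// Build the same two-column table markup that the one-liner writes out.
function toTableHtml(rows) {
  var html = "<html><head></head><body><table>";
  for (var i = 0; i < rows.length; i++) {
    html += "<tr><td>" + rows[i].text + "</td><td>" + rows[i].href + "</td></tr>";
  }
  return html + "</table></body></html>";
}

// In a browser, this mirrors what happens when the bookmark is clicked.
if (typeof document !== "undefined" && typeof window !== "undefined") {
  var out = window.open();
  if (out) { // window.open() returns null when the pop-up is blocked
    out.document.write(toTableHtml(collectLinks(document.getElementsByTagName("a"))));
    out.document.close();
  }
}
```

To use it as a bookmarklet it would still have to be squeezed back onto one line behind the javascript: prefix, as in the version above.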
Thumper
Posts: 8037
Joined: November 4th, 2002, 5:42 pm
Location: Linlithgow, Scotland

Post by Thumper »

Remind me to file a bug about the Page Info UI by the way. It's been in dire need of refactoring for years.

- Chris
jscher2000
Posts: 11762
Joined: December 19th, 2004, 12:26 am
Location: Silicon Valley, CA USA

Post by jscher2000 »

texmex wrote:Well here's a quick and dirty solution:

Way cool. Here's an alternate version that doesn't use a table, skips the "name" type anchors that lack an href, and adds a little clickable link.

javascript:loc=location.href;x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<html><head><title>Links!</title></head><body><h3>Links from %22+loc+%22</h3>%22);for(n=0;n<x.length;n++){if(x[n].href!="")y.document.write(%22<p>Text: %22+x[n].text+%22<br>\nURL: %22+x[n].href+%22 <a href=\%22%22+x[n].href+%22\%22>Go!</a></p>\n%22);}y.document.write(%22</body></html>%22);y.document.close();void 0;

(The few spaces in there are correct as some phrases and tags do have spaces in them. For purity, you could replace them with %20 to create a truly correct URL.)

I wonder why the new document appears behind the original one?
texmex
Posts: 403
Joined: March 26th, 2006, 8:58 am
Location: NE Lincs

Post by texmex »

OK so it's not as dirty, but since I got in first, it wasn't as quick either ;-) :-)

I chose to put it into a table as I noticed that berwin mentioned Excel. If you select all and copy the contents of my resultant window, you can then go to Excel and do a Paste Special > Text, and it will all split up nicely into columns. Wish I'd thought of removing the empty links though. I daresay a hybrid of the two solutions could be quite useful.
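
One possible shape for such a hybrid, as a sketch only: skip anchors that have no href and emit one tab-separated line per link, which spreadsheets generally split into two columns on paste. The name linksToTsv is invented for this example, and collapsing whitespace in the link text is an added assumption so that stray tabs or newlines cannot break the columns.

```javascript
// Hypothetical hybrid: tab-separated "text<TAB>url" lines, empty hrefs skipped.
function linksToTsv(anchors) {
  var lines = [];
  for (var n = 0; n < anchors.length; n++) {
    var href = anchors[n].href;
    if (!href) continue; // skip "name" type anchors that lack an href
    // Collapse runs of whitespace to one space, then trim both ends.
    var text = String(anchors[n].text || "")
      .replace(/\s+/g, " ")
      .replace(/^ | $/g, "");
    lines.push(text + "\t" + href);
  }
  return lines.join("\n");
}

// Browser usage (assumption): linksToTsv(document.getElementsByTagName("a"))
```

The resulting string could be written into a plain text area in the new window instead of a table, so that Select All > Copy picks up exactly the two columns.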

jscher2000 wrote:I wonder why the new document appears behind the original one?
I was wondering that too. I tried adding the line y.focus(); but to no avail, although it doesn't stop the code from running.
jscher2000
Posts: 11762
Joined: December 19th, 2004, 12:26 am
Location: Silicon Valley, CA USA

Post by jscher2000 »

texmex wrote:OK so it's not as dirty, but since I got in first, it wasn't as quick either ;-) :-)

:-D The problem with this project is, it's a bottomless pit. I decided I wanted to get images when there's no text...

javascript:loc=location.href;x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<html><head><title>Links!</title></head>\n<body><h3>Links from %22+loc+%22</h3>\n%22);for(n=0;n<x.length;n++){if(x[n].href!=""){if(x[n].text.replace(/\s+/,%22%22).length<1){for(j=0;j<x[n].childNodes.length;j++){if(x[n].childNodes[j].nodeName=="IMG"){y.document.write(%22<p>Image: <img src=\%22%22+x[n].childNodes[j].src+%22\%22 alt=\%22%22+x[n].childNodes[j].alt+%22\%22>%22); break;}}}else y.document.write(%22<p>Text: %22+x[n].text);y.document.write(%22<br>\nURL: %22+x[n].href+%22 <a href=\%22%22+x[n].href+%22\%22>=Go=&gt;</a></p>\n%22);}}y.document.write(%22</body></html>%22);y.document.close();void 0;

Thanks again for showing the way.
berwin
Posts: 3
Joined: April 5th, 2006, 11:02 pm

Post by berwin »

texmex, a thousand thanks to you, this works perfectly. Easy to paste into Excel and edit it there with two columns. Are you a genius? :-)

jscher2000, thanks for your effort and ideas.

I tried all three codes on a few web sites, and so far one website came back empty. Just a report, I am not complaining.

Is this forum great, or what?
dickvl
Posts: 54161
Joined: July 18th, 2005, 3:25 am

Post by dickvl »

berwin wrote:I tried all three codes on a few web sites, and so far one website came back empty. Just a report, I am not complaining.

Could be a frame issue?
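
If frames are the cause, the scripts above would only see the top document, which on a frameset page may contain no anchors at all. A sketch of one way to look inside the frames as well follows; collectFromWindow is a made-up name, and the try/catch assumes that a frame loaded from another site will refuse access to its document.

```javascript
// Recursively gather {text, href} pairs from a window and all of its frames.
function collectFromWindow(win, rows) {
  rows = rows || [];
  var anchors;
  try {
    anchors = win.document.getElementsByTagName("a");
  } catch (e) {
    return rows; // frame from another site: its document is off limits
  }
  for (var n = 0; n < anchors.length; n++) {
    if (anchors[n].href) {
      rows.push({ text: anchors[n].text, href: anchors[n].href });
    }
  }
  // win.frames is array-like: one entry per child frame or iframe.
  var frames = win.frames || [];
  for (var i = 0; i < frames.length; i++) {
    collectFromWindow(frames[i], rows);
  }
  return rows;
}

// Browser usage (assumption): var rows = collectFromWindow(window);
```

Links inside frames that come from a different site would still be missed, since the script is not allowed to read those documents.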
the-edmeister
Posts: 32249
Joined: February 25th, 2003, 12:51 am
Location: Chicago, IL, USA

Post by the-edmeister »

jscher & texmex,

Didn't realize it could be done with a Bookmarklet.
Thanks for these Bookmarklets, I am adding them to my collection.

Ed
tester123
Posts: 2
Joined: April 1st, 2008, 7:08 pm

Need help using this script on Japanese HTML pages

Post by tester123 »

I am trying this on Japanese HTML pages.
The page code has:

(a href="http://xyz.net/index.html")(STRONG)高級 (/STRONG)食通(/a)

Note: I have replaced the tag start and end <> signs with () since the forum was rendering the line above instead of showing it as written.

The JavaScript:

"javascript:x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<HTML><HEAD></HEAD><BODY><table>%22);for(n=0;n<x.length;n++){y.document.write(%22<tr><td>%22+x[n].text+%22</td><td>%22+x[n].href+%22</td></tr>%22);}y.document.write(%22</table></BODY></HTML>%22);y.document.close();void 0;"

I get the list as

食通 http://xyz.net/index.html

instead of

高級食通 http://xyz.net/index.html

Please note: I have faked the URL and the words as I am not supposed to disclose them, but you can try this on any Japanese page.

For a quick try, save the following as an HTML page and see:

<html lang="ja-JP">
<head>
<meta http-equiv="Content-type" content="text/html; charset=Shift_JIS" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<title>test</title>
</head>
<body>
<a href="http://xyz.net/index.html"><STRONG>高級 </STRONG>食通</a>
</body>
</html>


What makes this worse is that I don't know Japanese or JavaScript.
I need to get the list of all the URLs and the respective anchor texts from many Japanese HTML pages for verification, and doing this manually has already become a nightmare!

Please help!

thanks,

- tstr
jscher2000
Posts: 11762
Joined: December 19th, 2004, 12:26 am
Location: Silicon Valley, CA USA

Post by jscher2000 »

Good point. The script reads the link's .text property, and if there are other tags inside the link, the text within those tags is missed.

To solve that, try this. Change x[n].text to x[n].textContent (which is the Firefox equivalent of IE's innerText). Does it work?

If you actually wanted the full HTML from inside the link, to preserve the exact appearance, you could in theory change it to x[n].innerHTML, but I can't recommend it. Unless you thoroughly cleanse the HTML, you might end up moving untrusted code into a trusted context and creating a security problem for yourself later.
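
Even plain text from textContent gets parsed as HTML again once it is passed to document.write, so characters such as < and & are worth escaping first. A minimal sketch, with helper names (escapeHtml, rowHtml) that are invented here and not taken from the scripts above:

```javascript
// Replace the characters that HTML treats specially with their entities.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one table row from a link, using textContent so nested tags are included.
function rowHtml(anchor) {
  return "<tr><td>" + escapeHtml(anchor.textContent) +
         "</td><td>" + escapeHtml(anchor.href) + "</td></tr>";
}

// Browser usage (assumption): y.document.write(rowHtml(x[n])); inside the loop
```

This keeps the output as inert text, so markup that a page hides inside its link text cannot run in the new window.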
tester123
Posts: 2
Joined: April 1st, 2008, 7:08 pm

Post by tester123 »

jscher,
Sorry for the late reply.
"textContent" worked for me! It is fetching the correct text now.
Cool!
Thanks a ton!
-tstr