On "Page Info > Links", saving Name and Address
-
- Posts: 3
- Joined: April 5th, 2006, 11:02 pm
When right-clicking on a page > View Page Info > Links, a window shows up with a description on the left side and its link on the right side. When there are many links on the page, many descriptions and their links are listed.
My question is: Is it possible to save both descriptions and their links?
So far, using "Select All > Copy" and pasting into Notepad or Excel, I have only been able to get the links, never their descriptions.
- the-edmeister
- Posts: 32249
- Joined: February 25th, 2003, 12:51 am
- Location: Chicago, IL, USA
- jscher2000
- Posts: 11772
- Joined: December 19th, 2004, 12:26 am
- Location: Silicon Valley, CA USA
- Contact:
Didn't Netscape 4.x have the option to print pages with a list of all the links at the end? That was handy.
With a little bit of DOM programming, it definitely should be possible to extract all the links, leaving just the question of how to format the results. When I do applications like this, I tend to use the internet controls supplied with IE, programmed from a VBA host like Microsoft Word (just 'cause I'm so familiar with it). But JavaScript should work, too. I'm not familiar with all the respective powers of bookmarklets, Greasemonkey scripts, and extensions, but at least one of them should have the privileges necessary to do it.
-
- Posts: 403
- Joined: March 26th, 2006, 8:58 am
- Location: NE Lincs
Well here's a quick and dirty solution:
Copy the contents of this code to your clipboard
Go into Bookmark manager and create a new bookmark
Give it any name you like
Paste my code into the "Location" box.
Then close it.
Now if you click that bookmark, it will open a new window and write the links and their text strings into it as a table. You can then copy and paste the table.
Code:
javascript:x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<HTML><HEAD></HEAD><BODY><table>%22);for(n=0;n<x.length;n++){y.document.write(%22<tr><td>%22+x[n].text+%22</td><td>%22+x[n].href+%22</td></tr>%22);}y.document.write(%22</table></BODY></HTML>%22);y.document.close();void 0;
Sorry it's so wide, but it's imperative that you have no spaces in this line of code.
It only checks for anchor tags (not other types of links).
Hope this helps.
- Thumper
- Posts: 8037
- Joined: November 4th, 2002, 5:42 pm
- Location: Linlithgow, Scotland
- Contact:
- jscher2000
- Posts: 11772
- Joined: December 19th, 2004, 12:26 am
- Location: Silicon Valley, CA USA
- Contact:
texmex wrote: Well here's a quick and dirty solution:
Way cool. Here's an alternate version that doesn't use a table, skips the "name" type anchors that lack an href, and adds a little clickable link.
Code:
javascript:loc=location.href;x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<html><head><title>Links!</title></head><body><h3>Links from %22+loc+%22</h3>%22);for(n=0;n<x.length;n++){if(x[n].href!="")y.document.write(%22<p>Text: %22+x[n].text+%22<br>\nURL: %22+x[n].href+%22 <a href=\%22%22+x[n].href+%22\%22>Go!</a></p>\n%22);}y.document.write(%22</body></html>%22);y.document.close();void 0;
(The few spaces in there are correct as some phrases and tags do have spaces in them. For purity, you could replace them with %20 to create a truly correct URL.)
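The percent-encoding mentioned above can be done mechanically rather than by hand. Here is a minimal sketch, assuming you keep a readable copy of the script and convert it only when you want to paste it into the Location box. `toBookmarklet` is a made-up helper name, not something from the posts above.

```javascript
// Hypothetical helper: turn readable JavaScript source into a bookmarklet URL.
// encodeURIComponent escapes spaces (%20), double quotes (%22), angle
// brackets and so on, so the result contains no raw spaces.
function toBookmarklet(source) {
  // Wrap the script in a function so its variables stay out of the page's
  // global scope; void keeps the browser from navigating to a return value.
  var wrapped = "void(function(){" + source + "})()";
  return "javascript:" + encodeURIComponent(wrapped);
}

var url = toBookmarklet('alert("hi there");');
console.log(url);
```

Decoding everything after the `javascript:` prefix gives back the wrapped source unchanged, so nothing is lost by encoding the spaces.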
I wonder why the new document appears behind the original one?
-
- Posts: 403
- Joined: March 26th, 2006, 8:58 am
- Location: NE Lincs
OK so it's not as dirty, but since I got in first, it wasn't as quick either
I chose to put it into a table as I noticed that berwin mentioned Excel. If you select all and copy my resultant window, you can then go to Excel and do Paste Special > Text, and it will all split nicely into columns. Wish I'd thought of removing the empty links though. I daresay a hybrid of the two solutions could be quite useful.
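One way such a hybrid might look: keep the two-column layout for Excel but skip anchors without an href. This is only a sketch. It emits tab-separated text instead of an HTML table (a swapped-in format, chosen because tabs paste straight into columns), and `linksToTsv` is a made-up name. It accepts any array-like of objects with `text` and `href` properties, so in a bookmarklet you would pass it document.getElementsByTagName("A").

```javascript
// Hypothetical helper: one "text<TAB>url" line per link that has an href.
function linksToTsv(anchors) {
  var lines = [];
  for (var n = 0; n < anchors.length; n++) {
    var href = anchors[n].href;
    if (!href) continue; // skip "name"-type anchors with no href
    // Collapse tabs and line breaks inside the link text so each link
    // stays on one row and in one cell.
    var text = String(anchors[n].text || "").replace(/\s+/g, " ");
    text = text.replace(/^ | $/g, "");
    lines.push(text + "\t" + href);
  }
  return lines.join("\n");
}

// Plain objects stand in for anchor elements here.
var sample = [
  { text: "Home", href: "http://example.com/" },
  { text: "top", href: "" },
  { text: "About\n  us", href: "http://example.com/about" }
];
console.log(linksToTsv(sample));
```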
jscher2000 wrote: I wonder why the new document appears behind the original one?
I was wondering that too. I did try to add the line y.focus(); but to no avail. Even though it doesn't stop the code running.
- jscher2000
- Posts: 11772
- Joined: December 19th, 2004, 12:26 am
- Location: Silicon Valley, CA USA
- Contact:
texmex wrote: OK so it's not as dirty, but since I got in first, it wasn't as quick either
The problem with this project is, it's a bottomless pit. I decided I wanted to get images when there's no text...
Code:
javascript:loc=location.href;x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<html><head><title>Links!</title></head>\n<body><h3>Links from %22+loc+%22</h3>\n%22);for(n=0;n<x.length;n++){if(x[n].href!=""){if(x[n].text.replace(/\s+/,%22%22).length<1){for(j=0;j<x[n].childNodes.length;j++){if(x[n].childNodes[j].nodeName=="IMG"){y.document.write(%22<p>Image: <img src=\%22%22+x[n].childNodes[j].src+%22\%22 alt=\%22%22+x[n].childNodes[j].alt+%22\%22>%22); break;}}}else y.document.write(%22<p>Text: %22+x[n].text);y.document.write(%22<br>\nURL: %22+x[n].href+%22 <a href=\%22%22+x[n].href+%22\%22>=Go=></a></p>\n%22);}}y.document.write(%22</body></html>%22);y.document.close();void 0;
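Unpacked into readable form, the image fallback in that one-liner amounts to roughly the following. This is a sketch only: `describeLink` is a made-up name, and it accepts any object with `text`, `href`, and `childNodes`, so it can be tried outside a browser.

```javascript
// Hypothetical helper mirroring the bookmarklet's logic: use the link text
// if there is any, otherwise fall back to the first direct IMG child.
function describeLink(a) {
  var text = String(a.text || "").replace(/\s+/g, "");
  if (text.length > 0) {
    return { kind: "text", value: a.text, href: a.href };
  }
  var kids = a.childNodes || [];
  for (var j = 0; j < kids.length; j++) {
    if (kids[j].nodeName === "IMG") {
      return { kind: "image", value: kids[j].src, alt: kids[j].alt, href: a.href };
    }
  }
  // Neither text nor a direct image child was found.
  return { kind: "none", value: "", href: a.href };
}

// A plain object stands in for an anchor that wraps only an image.
var imgLink = {
  text: " ",
  href: "http://example.com/",
  childNodes: [{ nodeName: "IMG", src: "http://example.com/logo.png", alt: "Logo" }]
};
var result = describeLink(imgLink);
console.log(result.kind, result.value);
```

Like the original, this only looks at direct children of the link, so an image nested inside another tag would be missed.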
Thanks again for showing the way.
-
- Posts: 3
- Joined: April 5th, 2006, 11:02 pm
texmex, a thousand thanks to you, this works perfectly. Easy to paste into Excel and edit it there with two columns. Are you a genius? :-)
jscher2000, thanks for your effort and ideas.
I tried all three codes on a few web sites, and so far one website came back empty. Just a report, I am not complaining.
Is this forum great, or what?
- the-edmeister
- Posts: 32249
- Joined: February 25th, 2003, 12:51 am
- Location: Chicago, IL, USA
-
- Posts: 2
- Joined: April 1st, 2008, 7:08 pm
Need help using this script on Japanese HTML pages
I am trying this on Japanese HTML pages.
The page code has:
(a href="http://xyz.net/index.html")(STRONG)高級 (/STRONG)食通(/a)
Note: I have replaced the tag start and end <> signs with () because the forum was rendering the line above instead of showing it.
The JavaScript:
"javascript:x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<HTML><HEAD></HEAD><BODY><table>%22);for(n=0;n<x.length;n++){y.document.write(%22<tr><td>%22+x[n].text+%22</td><td>%22+x[n].href+%22</td></tr>%22);}y.document.write(%22</table></BODY></HTML>%22);y.document.close();void 0;"
I get the list as
食通 http://xyz.net/index.html
instead of
高級食通 http://xyz.net/index.html
Please note: I have faked the URL and the words, as I am not supposed to disclose them, but you can try this on any Japanese page.
For a quick try, save the following as an HTML page and see:
<html lang="ja-JP"> <head> <meta http-equiv="Content-type" content="text/html; charset=Shift_JIS" /> <meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<title>test</title> </head>
<body>
<a href="http://xyz.net/index.html"><STRONG>高級 </STRONG>食通</a>
</body>
</html>
What makes this worse is that I don't know Japanese or JavaScript.
I need to get the list of all the URLs and their respective anchor texts from many Japanese HTML pages for verification, and doing this manually has already become a nightmare!
Please help!
thanks,
- tstr
- jscher2000
- Posts: 11772
- Joined: December 19th, 2004, 12:26 am
- Location: Silicon Valley, CA USA
- Contact:
Good point. The script reads the link's .text property, which here returns only the link's own direct text, so any text inside nested tags is missed.
To solve that, try this. Change x[n].text to x[n].textContent (which is the Firefox equivalent of IE's innerText). Does it work?
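To see why that change picks up the text inside STRONG, here is a rough model of what textContent does: it gathers the text of every descendant, not just the link's direct text. The sketch walks a plain-object stand-in for the DOM so it can run anywhere. `collectText` is a made-up name, and in the bookmarklet itself you would simply read x[n].textContent.

```javascript
// Hypothetical stand-in for textContent: concatenate all descendant text.
// nodeType 3 is a text node, nodeType 1 an element, as in the DOM.
function collectText(node) {
  if (node.nodeType === 3) return node.nodeValue;
  var out = "";
  var kids = node.childNodes || [];
  for (var i = 0; i < kids.length; i++) {
    out += collectText(kids[i]);
  }
  return out;
}

// Model of: <a href="..."><STRONG>高級 </STRONG>食通</a>
var link = {
  nodeType: 1,
  childNodes: [
    { nodeType: 1, childNodes: [{ nodeType: 3, nodeValue: "高級 " }] },
    { nodeType: 3, nodeValue: "食通" }
  ]
};
console.log(collectText(link));
```

The space after 高級 is part of the sample page's markup, so it shows up in the collected text too.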
If you actually wanted the full HTML from inside the link, to preserve the exact appearance, you could in theory change it to x[n].innerHTML, but I can't recommend it. Unless you thoroughly cleanse the HTML, you might end up moving untrusted code into a trusted context and creating a security problem for yourself later.
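One way to lower that risk, even when only writing plain text and URLs, is to escape HTML special characters before passing strings to document.write. A minimal sketch follows; `escapeHtml` and `rowHtml` are made-up helpers, not built-ins, and this is not a substitute for a real sanitizer if you keep innerHTML.

```javascript
// Hypothetical helper: make a string safe to place inside HTML text or a
// double-quoted attribute by escaping the characters HTML treats specially.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;") // must come first so later entities stay intact
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one table row, as in the first bookmarklet, but with escaping.
function rowHtml(text, href) {
  return "<tr><td>" + escapeHtml(text) + "</td><td>" + escapeHtml(href) + "</td></tr>";
}

var row = rowHtml('<b>5 > 3 & "ok"</b>', "http://xyz.net/index.html");
console.log(row);
```

With this in place, markup inside a link's text is displayed as literal characters in the new window instead of being interpreted.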