On "Page Info > Links", saving Name and Address
17 posts
When right-clicking on a page > View Page Info > Links, a window shows up with a description on the left side and its link on the right side. When there are many links on the page, many descriptions and their links are listed.
My question is: Is it possible to save both the descriptions and their links? So far, using "Select All > Copy" and pasting into Notepad or Excel, I have only managed to get the links, never the descriptions of the links.

Maximize the window and take a screenshot.
Ed
A mind is a terrible thing to waste. Mine has wandered off and I'm out looking for it.
Thanks, I know it would work, but on a page with hundreds of links the process would become cumbersome, and then the conversion to type from graphics, oh boy...
I guess there is not a simple method... Didn't Netscape 4.x have the option to print pages with a list of all the links at the end? That was handy.
With a little bit of DOM programming, it definitely should be possible to extract all the links, leaving just the question of how to format the results. When I do applications like this, I tend to use the internet controls supplied with IE, programmed from a VBA host like Microsoft Word (just 'cause I'm so familiar with it). But JavaScript should work, too. I'm not familiar with all the respective powers of bookmarklets, Greasemonkey scripts, and extensions, but at least one of them should have the privileges necessary to do it.

Well, here's a quick and dirty solution:

1. Copy the contents of this code to your clipboard.
2. Go into the Bookmark Manager and create a new bookmark.
3. Give it any name you like.
4. Paste my code into the "Location" box and close.

Now if you click that bookmark, it will open a new window and write the links and their text strings into it as a table. You can then copy and paste the table.

Sorry it's so wide, but it's imperative that you have no spaces in this line of code. It only checks for anchor tags (not other types of links). Hope this helps.

Remind me to file a bug about the Page Info UI, by the way. It's been in dire need of refactoring for years.
- Chris
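For readers who want to see the idea spelled out, here is a minimal, readable sketch of a table-building script of the kind described above. It is not the code posted in the thread: `escapeHtml`, `buildLinkTable`, and `showLinkTable` are hypothetical names, and the escaping step is an addition so that link text cannot inject markup into the new window.

```javascript
// Hypothetical helper: escape characters that would be parsed as markup.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one table row per link: description on the left, address on the right.
// `links` is any array-like of objects with `textContent` and `href`.
function buildLinkTable(links) {
  const rows = Array.from(links, (a) =>
    '<tr><td>' + escapeHtml(a.textContent) + '</td><td>' +
    escapeHtml(a.href) + '</td></tr>'
  );
  return '<table>' + rows.join('') + '</table>';
}

// Browser-only part: open a new window and write the table into it.
// Defined but not called here, so the file also loads outside a browser.
function showLinkTable() {
  const w = window.open();
  w.document.write('<html><head></head><body>' +
    buildLinkTable(document.getElementsByTagName('a')) +
    '</body></html>');
  w.document.close();
}
```

To use it as a bookmarklet, the whole thing would still have to be squeezed onto one line behind a `javascript:` prefix and end with a call to `showLinkTable()`.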
Way cool. Here's an alternate version that doesn't use a table, skips the "name" type anchors that lack an href, and adds a little clickable link.
(The few spaces in there are correct as some phrases and tags do have spaces in them. For purity, you could replace them with %20 to create a truly correct URL.) I wonder why the new document appears behind the original one?

OK, so it's not as dirty, but since I got in first, it wasn't as quick either.
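Two of the points above, skipping anchors that lack an href and percent-encoding the script into a "truly correct" URL, can be sketched as follows. The function names are hypothetical, and the sketch assumes the browser percent-decodes a `javascript:` URL before running it, which is the usual behavior but worth checking in your own browser.

```javascript
// Hypothetical helper: keep only anchors that actually point somewhere.
// A "name" type anchor such as <a name="top"> has an empty or missing href.
function onlyRealLinks(links) {
  return Array.from(links).filter((a) => a.href);
}

// Hypothetical helper: percent-encode a script so the resulting
// javascript: URL contains no raw spaces or double quotes.
function toBookmarklet(source) {
  return 'javascript:' + encodeURIComponent(source);
}
```

For example, `toBookmarklet('alert("hi there");void 0;')` yields a string in which the space appears as `%20` and each quote as `%22`, the same encoding seen in hand-written bookmarklets.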
I chose to put it into a table as I noticed that berwin mentioned Excel. If you select all and copy my resultant window, you can then go to Excel and do a Paste Special.. Text.. and it will all nicely split up into the columns. Wish I'd thought of removing the empty links though. I daresay a hybrid of the two solutions could be quite useful.

I was wondering that too. I did try to add the line y.focus(); but to no avail, even though it doesn't stop the code running.
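As one possible take on the hybrid idea, the sketch below drops links without an href and emits tab-separated text, which spreadsheet programs generally split into columns on paste. `linksToTsv` is a hypothetical name, and the whitespace collapsing is an assumption about what makes a tidy row, not something from the thread.

```javascript
// Hypothetical helper: one line per link, description and address
// separated by a tab. Links without an href are skipped.
function linksToTsv(links) {
  // Collapse tabs, newlines, and runs of spaces so each link stays on one row.
  const clean = (s) => String(s).replace(/\s+/g, ' ').trim();
  return Array.from(links)
    .filter((a) => a.href)
    .map((a) => clean(a.textContent) + '\t' + clean(a.href))
    .join('\n');
}
```

In a browser, the result could be written into a `<pre>` element in the new window and copied from there.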
Thanks again for showing the way.

texmex, a thousand thanks to you, this works perfectly. Easy to paste into Excel and edit it there with two columns. Are you a genius? :-)
jscher2000, thanks for your effort and ideas. I tried all three codes on a few web sites, and so far one website came back empty. Just a report, I am not complaining. Is this forum great, or what?
Could be a frame issue?

jscher & texmex,
Didn't realize it could be done with a bookmarklet. Thanks for these bookmarklets; I am adding them to my collection.
Ed
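On the frame question raised above: a script that only looks at the top document will see nothing on a page built entirely from frames. A rough sketch of one way around that is to walk the window and its child frames. `collectLinks` is a hypothetical name; it assumes that frames from another site throw when accessed and simply skips them.

```javascript
// Hypothetical helper: gather links from a window and, recursively,
// from every frame inside it.
function collectLinks(win, out = []) {
  try {
    // document.links holds the <a> and <area> elements that have an href.
    for (const a of Array.from(win.document.links)) out.push(a);
    for (let i = 0; i < win.frames.length; i++) {
      collectLinks(win.frames[i], out);
    }
  } catch (e) {
    // Frame from another origin (or otherwise inaccessible): skip it.
  }
  return out;
}
```

In a bookmarklet, `collectLinks(window)` would take the place of `document.getElementsByTagName("A")`.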
I am trying this on Japanese html pages
The page code has:

(a href="http://xyz.net/index.html")(STRONG)高級 (/STRONG)食通(/a)

Note: I have replaced the tag start and end <> signs with (), since the forum was showing the effect of the above line instead of the line itself.

The JavaScript is:

javascript:x=document.getElementsByTagName(%22A%22);y=window.open();y.document.write(%22<HTML><HEAD></HEAD><BODY><table>%22);for(n=0;n<x.length;n++){y.document.write(%22<tr><td>%22+x[n].text+%22</td><td>%22+x[n].href+%22</td></tr>%22);}y.document.write(%22</table></BODY></HTML>%22);y.document.close();void 0;

I get the list as

食通 http://xyz.net/index.html

instead of

高級食通 http://xyz.net/index.html

Please note: I have faked the URL and the words, as I am not supposed to disclose them, but you can try this on any Japanese page. For a quick try, save the following as an HTML page and see:

<html lang="ja-JP">
<head>
<meta http-equiv="Content-type" content="text/html; charset=Shift_JIS" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<title>test</title>
</head>
<body>
<a href="http://xyz.net/index.html"><STRONG>高級 </STRONG>食通</a>
</body>
</html>

What makes this worse is that I don't know Japanese or JavaScript. I need to get the list of all the URLs and the respective anchor texts from many Japanese HTML pages for verification, and doing this manually has already become a nightmare! Please help!

Thanks,
- tstr

Good point. The script looks for the direct .text child of the link, and if there are other tags in there, that text is missed.
To solve that, try this. Change x[n].text to x[n].textContent (which is the Firefox equivalent of IE's innerText). Does it work? If you actually wanted the full HTML from inside the link, to preserve the exact appearance, you could in theory change it to x[n].innerHTML, but I can't recommend it. Unless you thoroughly cleanse the HTML, you might end up moving untrusted code into a trusted context and creating a security problem for yourself later.

jscher,
Sorry for the late reply. "textContent" has worked for me! It is fetching the correct text now. Cool! Thanks a ton!
-tstr
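To illustrate why the switch helped, the sketch below models the link from the Japanese test page as a plain-object tree (strings stand in for text nodes; this is not the real DOM API). Gathering text from direct children only loses the part inside STRONG, while a recursive walk, which is roughly what textContent does, keeps it. `collectText` is a hypothetical name, and the "direct children only" line mirrors the explanation given in the thread rather than a verified account of how .text behaved.

```javascript
// Hypothetical helper: concatenate the text of a node and all its descendants.
function collectText(node) {
  if (typeof node === 'string') return node; // a text node
  return node.children.map(collectText).join('');
}

// Stand-in for: <a href="..."><STRONG>高級 </STRONG>食通</a>
const link = {
  tag: 'a',
  children: [{ tag: 'strong', children: ['高級 '] }, '食通'],
};

// Direct text children only: anything nested inside another tag is missed.
const directOnly = link.children
  .filter((c) => typeof c === 'string')
  .join('');

// Recursive walk: nested text is included, in document order.
const fullText = collectText(link);
```

Here `directOnly` is "食通" and `fullText` is "高級 食通" (the space comes from inside the STRONG tag in the test page).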