MozillaZine

Nomenclature - absolute vs relative URIs

Pim

Posts: 2205
Joined: May 17th, 2004, 2:04 pm
Location: Netherlands

Post Posted August 12th, 2015, 12:31 am

Sorry to sound like a newbie, but when answering a question somewhere, I found that I don't have all the names in my head that I thought I had. And I thought I'd do a quick lookup, but the more pages I find, the more contradictory definitions I encounter!

URIs can consist of a scheme (the protocol like http:), domain (name of the website) and the directory tree and filename. Now if you have all of these, it's called an absolute path, and if you only have the filename (or something like ../filename), it's a relative path.

But here's my question: what do you call a path like this?

/directory/file

Some sources say it's a root path, others seem to indicate it's a relative path. The IETF is particularly confusing, saying that a relative URI can be an absolute path.

So now I'm not sure any more. Is /directory/file a relative path or not? Not to mention paths with a scheme but no domain (http:file). Which official source should I take as gospel?
Groetjes, Pim

trolly
Moderator

Posts: 39896
Joined: August 22nd, 2005, 7:25 am

Post Posted August 12th, 2015, 3:13 am

To my knowledge it should be the root of the web server. At least on Unix-like OSes you can restrict a process to a subtree of the real file system.
Think for yourself. Otherwise you have to believe what other people tell you.
A society based on individualism is an oxymoron. || Freedom is at first the freedom to starve.
Constitution says: One man, one vote. Supreme court says: One dollar, one vote.

jscher2000

Posts: 10142
Joined: December 19th, 2004, 12:26 am
Location: Silicon Valley, CA USA

Post Posted August 12th, 2015, 1:40 pm

I don't know the strict definition, but in somepage.html I would only consider these kinds of URLs to be relative:

dir/otherpage.html
otherpage.html
../dir/otherpage.html
../otherpage.html

The reason is that in those cases, you cannot determine the correct path to otherpage.html unless you know the precise location of somepage.html on the server, AND the path to otherpage.html changes if you change the location of somepage.html.

All the other examples specify a path to otherpage.html that doesn't depend on the precise location of somepage.html AND does not change if you change the location of somepage.html on the server.
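
To make that concrete, here is a rough Python sketch using the standard library's urllib.parse.urljoin. The host example.com and the two page locations are made up for illustration; only the file names come from the list above.

```python
from urllib.parse import urljoin

# Hypothetical locations of somepage.html before and after moving it.
OLD_PAGE = "http://example.com/a/somepage.html"
NEW_PAGE = "http://example.com/a/b/somepage.html"

def resolve_from(page_url, references):
    """Resolve each reference against the URL of the page that contains it."""
    return {ref: urljoin(page_url, ref) for ref in references}

refs = ["otherpage.html", "../otherpage.html", "/dir/otherpage.html"]
before = resolve_from(OLD_PAGE, refs)
after = resolve_from(NEW_PAGE, refs)

for ref in refs:
    verdict = "changes" if before[ref] != after[ref] else "stays the same"
    print(f"{ref!r}: {before[ref]} -> {after[ref]} ({verdict})")
```

Only the reference that starts with a slash should come out the same for both page locations.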

Frenzie

Posts: 2134
Joined: May 5th, 2004, 10:40 am
Location: Belgium

Post Posted August 13th, 2015, 3:57 am

Pim wrote:The IETF is particularly confusing, saying that a relative URI can be an absolute path.

What specifically are you talking about? Besides heading 10 about the <BASE> element, I can't say I see any such thing in the document.

I think your confusion stems from the difference between the words path and URL. /bla/blabla is an absolute path, but simultaneously a relative URL because it lacks a protocol and a domain name. That's always been my understanding and I think it's supported by 2.4.6.
Intelligent alien life does exist, otherwise they would have contacted us.

Olhanzilla

Posts: 146
Joined: February 10th, 2005, 8:29 pm

Post Posted August 15th, 2015, 5:32 pm

Not knowing the exact context in which you are asking this question - here's a stab at answering it.

First, here is the difference between a URI and a URL https://en.wikipedia.org/wiki/Uniform_resource_identifier - by the way, that is an absolute URL

For the discussion here, we can treat them as the same thing. Technically they are different, but I assume, based upon your question, that we can ignore that difference.

Due to the way this forum software handles text that looks like a URL, most of the examples I give below will appear as links - just look at the text itself and ignore the fact that they are displayed as links - such as in a different color and maybe underlined.

Say there is a page named myfile.html on the host/domain http://www.bobzyx.com (I originally typed that name as http://www.zyx.com -- but there is actually a web site there and I wanted to use a host/domain name that does not exist)

The absolute URL to that file would be http://www.bobzyx.com/myfile.html.

If there is a link on that page to another page - say, otherpage.html - that link could be coded either as an absolute URL or a relative URL.

The absolute URL is http://www.bobzyx.com/otherpage.html and the relative URL is /otherpage.html ---- resolved relative to the site that hosts the page containing the link.

When we talk of relative or absolute URLs we are really talking about browser processing. All URLs sent by the browser to the server must be absolute URLs. The browser has to tell the server exactly where the web page (or image file, etc.) resides.

It is the browser which handles things differently based upon whether a URL is relative or absolute.

Put your cursor over this URL http://www.bobzyx.com/myfile.html and look at the status bar (usually at the bottom of the browser window) and you will see the URL the browser will send to the server if you click that link. It is (or should be) http://www.bobzyx.com/myfile.html

Now put your cursor over this URL /myfile.html and look at the URL in the status bar. It's the same as when you were over the absolute URL, above.

Why? It is because, as I said earlier, the browser must send the server the absolute URL of the page.

When the browser sees a link on a page, it first determines if the URL is an absolute URL or a relative URL. Basically, if the URL starts with a protocol, in this case HTTP, the browser assumes it is an absolute URL (there are several other criteria the browser uses to determine if a URL is an absolute or relative URL - but that really isn't important here)

If the browser finds an absolute URL it sends it, with no changes, to the server. If the browser detects a relative URL, it prefixes the URL with the missing parts, taken from the page's own URL (also called the Base URL) - in this case the protocol http:// and the host name www.bobzyx.com - to make the absolute URL ---- http://www.bobzyx.com/myfile.html which it sends to the server.

Absolute means that the URL is complete, it is exactly what the server needs to get the page, image file, JavaScript file, etc.

Relative tells the browser that it has to prefix the URL (as coded in the page) with the missing parts from the Base URL (here, the protocol and host name) to construct the absolute URL to send to the server.
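
The "add the protocol and host name" step described above can be sketched in a few lines of Python. This is only an illustration, not what any browser actually runs: the function name prefix_with_origin is made up, and it deliberately handles only references that start with a single slash.

```python
from urllib.parse import urlsplit, urlunsplit

def prefix_with_origin(page_url, root_relative):
    """Build an absolute URL by borrowing scheme and host from the page URL.

    Assumption for this sketch: the reference starts with a single "/".
    """
    if not root_relative.startswith("/") or root_relative.startswith("//"):
        raise ValueError("expected a reference starting with a single '/'")
    page = urlsplit(page_url)
    ref = urlsplit(root_relative)
    # Scheme and host come from the page; path, query and fragment from the link.
    return urlunsplit((page.scheme, page.netloc, ref.path, ref.query, ref.fragment))

print(prefix_with_origin("http://www.bobzyx.com/myfile.html", "/otherpage.html"))
```

References without a leading slash need more work, because the directory part of the page's path has to be taken into account as well.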

The only time you should use absolute URLs is when the page you're after does not exist on your web site.

Use relative URLs to refer to pages (or image files, etc) that are on your web site to avoid any problems if part of the URL changes - say you move the files to another directory on the server.

Let me throw a curve ball at you - there is an HTML element named base. You can use that to specify the Base URL the browser uses to resolve relative URLs. Read this page https://developer.mozilla.org/en-US/docs/Web/HTML/Element/base
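
As a rough illustration of what the base element changes, here is a Python sketch built on the standard html.parser and urllib.parse modules. The class name LinkResolver, the /docs/ directory and top.html are invented for the example, and a real browser handles many more cases than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkResolver(HTMLParser):
    """Collect href values from <a> tags, honoring the first <base href>."""

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url   # default base: the page's own URL
        self.base_seen = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base" and not self.base_seen:
            # The base href may itself be relative to the page URL.
            self.base_url = urljoin(self.base_url, href)
            self.base_seen = True
        elif tag == "a":
            self.links.append(urljoin(self.base_url, href))

def resolve_links(page_url, html):
    parser = LinkResolver(page_url)
    parser.feed(html)
    parser.close()
    return parser.links

# Hypothetical page: the /docs/ base and top.html are made up.
HTML = (
    '<html><head><base href="http://www.bobzyx.com/docs/"></head>'
    '<body><a href="otherpage.html">x</a> <a href="/top.html">y</a></body></html>'
)
print(resolve_links("http://www.bobzyx.com/myfile.html", HTML))
```

With the base element in place, otherpage.html should resolve under /docs/ even though the page itself sits at the root, while /top.html still resolves from the root of the host.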

Also, there is the matter of a page or file that exists in a directory which is at the same level as the page with the link.

For instance: the web site http://www.bobzyx.com has images in a different directory from the pages that refer to them.

If the image files are in the directory /img/ and the pages are at the root, above /img/, it gets just a tad more difficult to create the correct relative URL.

If the images are in a directory below the directory in which the pages reside, you would have to code it like this: img/imagefile.jpg

If the images were in the same directory as the pages, you'd code the URL as imagefile.jpg

Things can get even more complicated if you have several layers of directories. You might find yourself having to code something like ../../js/file.js

I won't go into exactly what that would create because - thank god, if you are using a web page creator program to make your pages, you simply tell it where the image or such is when you create the link to it. The program will construct the relative URL accordingly.
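
For anyone who does want to see what a reference like ../../js/file.js works out to, here is a simplified Python sketch of the directory walk. The function name walk_reference and the page path /articles/2015/page.html are made up, and the sketch ignores query strings, fragments and the other details of the full RFC 3986 resolution algorithm.

```python
def walk_reference(page_path, reference):
    """Resolve a document-relative reference against a page path, segment by segment.

    Simplified: handles "." and ".." only, and assumes page_path starts with "/".
    """
    # Start in the directory that contains the page (drop the file name).
    stack = page_path.split("/")[1:-1]
    for segment in reference.split("/"):
        if segment == "..":
            if stack:
                stack.pop()        # go up one directory, never above the root
        elif segment not in (".", ""):
            stack.append(segment)  # go down into a directory, or name the file
    return "/" + "/".join(stack)

print(walk_reference("/articles/2015/page.html", "../../js/file.js"))
```

Each ".." drops one directory from the page's location, so two of them take a page in /articles/2015/ back to the root before descending into js/.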

That should help to confuse you even more :)

I hope this helps but I could be totally off track as to what your question really was.

By the way - the URL the browser actually sends to the server won't contain the host /domain name - such as http://www.bobzyx.com

Instead it has to send the IP address - the numeric address assigned to the server. It gets the IP address by sending the host/domain name to a DNS - Domain Name Server. The DNS looks up the host/domain name and returns the IP address for that name.

If you get a "Server not found" error, it means that DNS could not find the host/domain name in its databases. There is not one DNS server for the web, there are many. There is a process they use to update their databases with information from other DNS servers. This is why if you move a web site to a different hosting company - thus changing the IP address of the web site - it might take a while - maybe hours - for the change to propagate across all of the DNS servers and get to the one your browser uses. Your site may be unreachable for some visitors until the DNS server they use has the new address. Until it is updated, that DNS server would return the old IP address, and people would be sent to the old server and probably get errors.

jscher2000

Posts: 10142
Joined: December 19th, 2004, 12:26 am
Location: Silicon Valley, CA USA

Post Posted August 16th, 2015, 9:18 pm

Just a note on this part:

Olhanzilla wrote:By the way - the URL the browser actually sends to the server won't contain the host /domain name - such as http://www.bobzyx.com

Instead it has to send the IP address - the numeric address assigned to the server. It gets the IP address by sending the host/domain name to a DNS - Domain Name Server.

Firefox makes a connection using the IP address, but the request to the server does include the host name. The reason is shared hosting, where numerous virtual sites share a single IP address. If browsers did not send the host name, the servers would not know which site to serve. That is why the result of substituting the IP address for the host name in a URL is unpredictable.
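
Roughly, the request looks like the text this Python sketch produces. It is a minimal illustration only: the function name build_get_request is made up, the "Connection: close" header is my own addition, and a real browser sends many more headers.

```python
from urllib.parse import urlsplit

def build_get_request(url):
    """Return the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [
        f"GET {target} HTTP/1.1",   # request line carries only path and query
        f"Host: {parts.netloc}",    # host name travels in its own header
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines)

print(build_get_request("http://www.bobzyx.com/myfile.html"))
```

The IP address appears nowhere in the request text; it is only used to open the connection, and the Host line is what lets a shared server pick the right site.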

mgagnonlv
 
Posts: 676
Joined: February 12th, 2005, 8:33 pm

Post Posted August 17th, 2015, 12:25 pm

Pim wrote:...URIs can consist of a scheme (the protocol like http:), domain (name of the website) and the directory tree and filename. Now if you have all of these, it's called an absolute path, and if you only have the filename (or something like ../filename), it's a relative path.

But here's my question: what do you call a path like this? /directory/file


I don't want to split hairs, but Dreamweaver and a few Content Management Systems define both as relative paths.
– relative to the document : image/image2.jpg or ../image/image2.jpg (go up once, then down once)
– relative to website root: /image/image2.jpg.

Absolute links should generally only be used for links to another website. Relative links should be used within a site. Which of the latter is best amounts to personal preference, but also to possible cut-and-paste between different code pages. If you plan to use a reference to a given page or image at various places within your website, then being relative to website root is easier to debug.
Michel Gagnon
Montréal (Québec, Canada)

Frenzie

Posts: 2134
Joined: May 5th, 2004, 10:40 am
Location: Belgium

Post Posted August 18th, 2015, 6:34 am

mgagnonlv wrote:I don't want to split hairs, but Dreamweaver and a few Content Management Systems define both as relative paths.

Then it would seem that they're wrong. Appendix A to RFC 3986 (which, upon paying closer attention, I noticed obsoleted the RFC linked by Pim) uses the same terminology I already pointed out previously.

   hier-part     = "//" authority path-abempty
                 / path-absolute
                 / path-rootless
                 / path-empty

   URI-reference = URI / relative-ref


   path          = path-abempty    ; begins with "/" or is empty
                 / path-absolute   ; begins with "/" but not "//"
                 / path-noscheme   ; begins with a non-colon segment
                 / path-rootless   ; begins with a segment
                 / path-empty      ; zero characters


Note, however, that what is commonly referred to as a relative path is called a rootless path in the RFC.
Intelligent alien life does exist, otherwise they would have contacted us.

Dom1953
 
Posts: 52
Joined: July 24th, 2014, 6:02 am
Location: Australia

Post Posted August 18th, 2015, 6:57 pm

To recap, some confusion seems to come from assuming that an Absolute URI and an Absolute Path mean the same thing, or that a relative reference is the same as a relative path. In terms of the strict IETF nomenclature used in RFC 3986, they are not.

An "Absolute URI" (or sometimes just "URI") by definition starts with a scheme name (e.g. "http:"). Relative references are references (href attribute values) that do not start with a scheme. Too easy.

A relative reference that starts with a single "/" starts at the top or root of the hierarchical addressing scheme provided by the web site server. Although it takes the form of an absolute path for a file system and is called an absolute path in the RFC, it remains a relative reference. It can't be resolved without the absolute URL of the host or the absolute URL provided in a BASE HTML element.

Relative references that do not start with a "/" slash are "relative paths" resolved with respect to the HTML page location. As noted by Frenzie, the RFC 3986 grammar actually never uses the term "relative path". But I think it is safe to say that the productions it does use, "path-rootless" and "path-noscheme", have not been adopted in everyday speech. "Relative path" works for me.

So technically /directory/file is a relative reference using an absolute path.
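
The distinctions above can be sketched as a small Python classifier. This is my own simplification, not code from the RFC: the function name classify and the label strings are made up, and it only looks at how a reference starts without validating the rest.

```python
import re

# A scheme is a letter followed by letters, digits, "+", "-" or ".", then ":".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")

def classify(reference):
    """Name a URI reference by how it starts (simplified, no validation)."""
    if SCHEME_RE.match(reference):
        return "URI with a scheme"
    if reference.startswith("//"):
        return "relative reference with a host but no scheme"
    if reference.startswith("/"):
        return "relative reference using an absolute path"
    if reference == "" or reference[0] in "?#":
        return "relative reference with an empty path"
    return "relative reference using a relative path"

for ref in ("http://example.com/directory/file", "//example.com/directory/file",
            "/directory/file", "directory/file", "../file"):
    print(f"{ref:35} {classify(ref)}")
```

Everything except the first case needs a base URL before it can be fetched, which is why /directory/file still counts as a relative reference.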

The other example http:file looks invalid: it does not meet the grammar of an Absolute URI (it would need an authority string starting with "//" after "http:") and is invalid as a relative reference (the first segment must not contain a colon).


But what I really wanted to comment on was the reserved domain names defined in RFC6761 like "test.", "invalid.", "localhost." and "example." Of these "example.com" (or .net or .org) are officially reserved for documentation and a good choice for fictitious addresses on this forum.

Greetings, Dom

BruceAWittmeier

Posts: 2633
Joined: June 9th, 2008, 10:53 am
Location: Near 37.501685 -80.147967

Post Posted September 27th, 2015, 10:39 am

You may have resolved your problem by now, however...

The one syntax I didn't see is the relative path written like this: "/directory/file" would be "./directory/file", with one dot before the first "/". Two dots mean: go up one path level, then descend as the path specifies. A single dot is the notation for the current directory as the starting point. I have seen some cases where the missing first dot causes the file lookup to fail.
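
A quick way to compare the two spellings is Python's urllib.parse.urljoin. The page URL on example.com is a made-up assumption; the point is only to see where each form ends up.

```python
from urllib.parse import urljoin

PAGE = "http://example.com/docs/page.html"  # hypothetical page containing the links

REFS = ("/directory/file", "./directory/file", "directory/file", "../directory/file")
resolved = {ref: urljoin(PAGE, ref) for ref in REFS}

for ref, url in resolved.items():
    print(f"{ref:20} -> {url}")
```

For a page inside /docs/, the forms with and without the leading "./" should resolve to the same place under /docs/, while the form with the leading slash starts from the root of the site, so "./directory/file" and "/directory/file" only coincide when the page itself sits at the root.
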
~ I'm only here to Pay it Forward. ~

"I often take a very long windy road to my destination. When I arrive I often wonder how I missed the shortcut".
