Johns Hopkins University, The Sheridan Libraries



Understanding and decoding URLs


Uniform Resource Locators, or URLs, are the Internet addresses that you see in the Location bar at the top or bottom of your Web browser (e.g., Netscape or Internet Explorer). URLs provide a standard format for addressing a wide variety of information types. Here is how they are constructed:

transfer protocol://servername.domain/directory/subdirectory/filename.filetype

Every URL must have at least the first two elements shown above (the information directly before and after the //). Here are some examples:

http://milton.mse.jhu.edu:8001/research/education/url.html
ftp://milton.mse.jhu.edu/pub/research.txt
gopher://milton.mse.jhu.edu/databases/
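
The breakdown above can be tried out with a short Python sketch. It relies on the standard library's urllib.parse module; the function name decode_url and the dictionary keys are made up for this illustration and are not part of any library.

```python
from urllib.parse import urlsplit


def decode_url(url):
    """Split a URL into the parts described in this guide (illustrative helper)."""
    parts = urlsplit(url)
    # Everything before the last "/" is the directory path; the rest is the file name.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,    # e.g. "http", "ftp", "gopher"
        "server": parts.hostname,    # servername.domain, without any port
        "port": parts.port,          # None when the URL does not name a port
        "directory": directory,      # e.g. "/research/education"
        "filename": filename,        # empty string when the URL ends in "/"
    }


if __name__ == "__main__":
    for example in (
        "http://milton.mse.jhu.edu:8001/research/education/url.html",
        "ftp://milton.mse.jhu.edu/pub/research.txt",
        "gopher://milton.mse.jhu.edu/databases/",
    ):
        print(decode_url(example))
```

Running it on the three example URLs shows that only the first one names a port and that the gopher address has no file name at all.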

Understanding the different elements of a URL helps you know what to expect before you click on a link. You can also work out what kind of organization or institution the information comes from. In some cases, you may be able to reconstruct someone's e-mail address from a URL.

The 1st part: Transfer protocol

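The first part of the URL indicates what type of information is being transferred and, usually, what port (or "door") to the server is being accessed. Each protocol has a default port; a URL that uses a different one names it right after the server address, as in the ":8001" of the first example above. Here are the most common types: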

  • http: Hypertext (what you are viewing now: the standard format for the World Wide Web)
  • gopher: Gopher format (text only precursor of the Web: still good for text-based information)
  • ftp: File Transfer Protocol (Whoa! A computer file is about to be sent to your computer. Proceed with caution if this is new to you.)
  • news: Newsgroup format (something like a special interest bulletin board)
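
To illustrate how the protocol usually decides which "door" is used, here is a minimal Python sketch. The port numbers are the well-known defaults for these protocols; the names DEFAULT_PORTS and effective_port are invented for this example.

```python
from urllib.parse import urlsplit

# Well-known default ports for the protocols listed above.
# "news" URLs usually name a newsgroup rather than a server; 119 is the
# port a news (NNTP) client would normally use.
DEFAULT_PORTS = {"http": 80, "gopher": 70, "ftp": 21, "news": 119}


def effective_port(url):
    """Return the port a browser would use for this URL, or None if unknown."""
    parts = urlsplit(url)
    if parts.port is not None:
        # The URL names its own port, as in ":8001".
        return parts.port
    # Otherwise fall back on the protocol's default.
    return DEFAULT_PORTS.get(parts.scheme)


if __name__ == "__main__":
    print(effective_port("http://milton.mse.jhu.edu:8001/research/education/url.html"))
    print(effective_port("ftp://milton.mse.jhu.edu/pub/research.txt"))
```

The table only covers the four protocols in the list; anything else comes back as None rather than a guess.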

The 2nd part: Servername.domain

When you perform a simple yet elegant click of the mouse, your Web browser goes into high gear. It sends a message to a server, the computer where a Web site resides, asking it to send you information. The transfer protocol tells your computer and the server how to format and interpret the information they exchange, including details such as which browser you are using (e.g., Netscape 4.7 or Internet Explorer 6.0). The servername.domain is the address of the server itself: your message has to go somewhere. Most server addresses have three parts:

  • the actual name of the server (yes, they have feelings, too)
  • the domain (the institution/organization/enterprise/whatever where the machine lives)
  • a suffix that varies by country:
    • a domain type (educational, commercial, network, organizational, governmental, military), used mainly in the US and UK
    • a two-letter country code (not always applicable in the US)

Most servers have a name of some kind. It is a fallacy that all Web servers are called "www." Many are, but that is a simple matter of choice. Look at the URL of the document you are reading: you're looking at it on the Web, and there is no "www" in its name. The domain is key to understanding where the information is coming from. Is it an educational institution, or is it a commercial service such as Prodigy? This is an important consideration when you are trying to evaluate an electronic document. Don't forget, the Internet is vanity publishing on its largest scale. This doesn't mean that documents coming from someone's personal commercial account are not valuable: it means that you have to apply your critical thinking skills.
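
As a rough illustration of reading a server address, the Python sketch below splits a host name into the parts described above. The function name describe_host and its rules are assumptions made for this example: it treats any two-letter final label as a country code and only recognizes the six US-style domain types named in this guide, so it will misread some real addresses.

```python
# US-style domain types mentioned in this guide.
DOMAIN_TYPES = {
    "edu": "educational",
    "com": "commercial",
    "net": "network",
    "org": "organizational",
    "gov": "governmental",
    "mil": "military",
}


def describe_host(hostname):
    """Split a server address into server name, domain, type, and country code."""
    labels = hostname.lower().split(".")
    country = None
    # Assumption of this sketch: a two-letter final label is a country code.
    if len(labels) > 1 and len(labels[-1]) == 2:
        country = labels.pop()
    domain_type = None
    if len(labels) > 1 and labels[-1] in DOMAIN_TYPES:
        domain_type = DOMAIN_TYPES[labels.pop()]
    # With more than one label left, the first is the server's own name.
    server = labels[0] if len(labels) > 1 else None
    domain = ".".join(labels[1:]) if len(labels) > 1 else labels[0]
    return {"server": server, "domain": domain, "type": domain_type, "country": country}


if __name__ == "__main__":
    print(describe_host("milton.mse.jhu.edu"))
    print(describe_host("www.company.com"))
```

For milton.mse.jhu.edu this reports a server called "milton" in the educational domain "mse.jhu", with no country code.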

  • View a list of two-letter country codes from Dave Price of Support-One. Nota bene: This document will tell you, correctly, that the US has a two-letter code. That code is used inconsistently and does not appear in all domain names in the U.S.

The 3rd part: Directories and subdirectories

Once you have been admitted to a server to get a document or "page", you need to know where you're going. Servers act just like your home computer: you keep your word-processing program in a separate directory from your modem software. In fact, your computer probably keeps the word processing documents in a separate subdirectory in the word-processing directory. Computers need to be neat and tidy in order to run efficiently. This is even more true when the computer is sending billions of bytes of information all over the world. The third part of the URL takes you directly to the directory and subdirectory where the page you want lives.

Now, there's a trick to some of this. Does the URL look like one of the following?

http://server.state.edu/~jsmith/mypage.html
http://www.company.com/users/jsmith/metoo.html
http://bigmachine.neighborhood.net:8001/people/jsmith/myturn.html

When you see a directory or subdirectory that

  • begins with a tilde (~) and looks like a person's name, or
  • follows a directory called "/users/" or "/people/" and looks like a person's name,

it's probably a Web page living on someone's personal Internet account. You can often reconstruct their account name and address and send them an e-mail message. Here's what the addresses from the examples above would look like:

jsmith@server.state.edu
jsmith@company.com (Notice that the "www." got dropped)
jsmith@bigmachine.neighborhood.net (Notice that the ":8001" got dropped)


It doesn't work every time, but it's still a good bet if you really need to get information from someone. And remember: if the address you reconstruct does not exist, your message will usually be returned to you, so you will know that the address is wrong.
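
The guesswork described above can be written down as a small Python sketch. The name guess_email and the pattern it uses are invented for this illustration; it only looks at the first directory in the path, and, as noted, the address it produces is a guess that may not exist.

```python
import re
from urllib.parse import urlsplit

# Paths that suggest a personal account: /~name/, /users/name/, /people/name/
# (only checked at the start of the path in this sketch).
PERSONAL_PATH = re.compile(r"^/(?:~|users/|people/)([^/]+)/")


def guess_email(url):
    """Guess an e-mail address from a personal-page URL, or return None."""
    parts = urlsplit(url)
    match = PERSONAL_PATH.match(parts.path)
    if match is None or parts.hostname is None:
        return None
    host = parts.hostname  # the hostname never includes a port such as ":8001"
    if host.startswith("www."):
        host = host[len("www."):]  # drop the "www." as in the example above
    return f"{match.group(1)}@{host}"


if __name__ == "__main__":
    for example in (
        "http://server.state.edu/~jsmith/mypage.html",
        "http://www.company.com/users/jsmith/metoo.html",
        "http://bigmachine.neighborhood.net:8001/people/jsmith/myturn.html",
    ):
        print(guess_email(example))
```

A URL without one of these patterns, such as an ordinary departmental page, yields None instead of an address.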

The last part: Filename.filetype

The last part of the URL specifies the individual document you are looking at. If you go to the home page of any particular organization, there's a good chance that your URL doesn't include a file name. Once you follow a link from that page, the URL probably will include one. Some standard file types include:

  • .html or .htm: hypertext (the standard for the Web)
  • .gif, .jpg, .bmp: image types (formats of visual images)
  • .zip, .tar: compressed or archived files (proceed with caution: these are specially packaged files that will be downloaded onto your hard drive; you need to know whether your computer can interpret them, and you may need to get an "unzipping" utility)
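
As a final illustration, this Python sketch looks up the file type at the end of a URL. The table covers only the extensions listed above, and the names FILE_TYPES and file_type are made up for this example.

```python
import posixpath
from urllib.parse import urlsplit

# Only the file types listed in this guide; anything else is reported as unknown.
FILE_TYPES = {
    ".html": "hypertext",
    ".htm": "hypertext",
    ".gif": "image",
    ".jpg": "image",
    ".bmp": "image",
    ".zip": "compressed or archived file",
    ".tar": "compressed or archived file",
}


def file_type(url):
    """Return a description of the file type, or None if the URL has no file name."""
    filename = posixpath.basename(urlsplit(url).path)
    if not filename:
        return None  # e.g. a home page or directory URL ending in "/"
    extension = posixpath.splitext(filename)[1].lower()
    return FILE_TYPES.get(extension, "unknown")


if __name__ == "__main__":
    print(file_type("http://milton.mse.jhu.edu:8001/research/education/url.html"))
    print(file_type("gopher://milton.mse.jhu.edu/databases/"))
```

The extension is only a hint about what a file contains, so treat the result as a first guess rather than a guarantee.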



Sheridan Libraries
3400 North Charles Street, Baltimore, MD 21218
(410)-516-8335
Copyright 2004 | Disclaimer | Privacy Policy