
Network Programming

This website demonstrates the use of wikis as a teaching and learning tool.

The course instructor is happy to share the teaching materials here with anyone who finds them useful.

Tutorial - Java Network Programming - Web Client Example

Introduction

  • In this tutorial, you will look at a simple HTTP client implemented using a socket.
  • You will be provided with the source code and explanation of this example program.
  • Before reading on, let’s get an idea of what an HTTP client basically does.

The HyperText Transfer Protocol

The HyperText Transfer Protocol (HTTP) is the set of rules for transferring files (for example, HTML documents, text, images, sound, video, and other multimedia files) between web clients and web servers. Within the TCP/IP suite of protocols, HTTP is an application-layer protocol. When this tutorial was written, the latest version of HTTP was HTTP/1.1; HTTP/2 and HTTP/3 have been standardized since then.
HTTP communication can be divided into two separate parts, namely, HTTP requests and HTTP responses.
An HTTP request, which is made by a web client, contains information about a resource on the web server, and the action the client wishes the server to perform on the resource.
A web server fulfils an HTTP request by returning an HTTP response, which varies with the type of request and whether or not the request could be serviced.
To retrieve a web page from an HTTP server, an HTTP client sends a GET request to the server, specifying a particular path and file name. GET is one of the request methods defined by HTTP. For example, the following HTTP request:
      GET /index.html HTTP/1.0
tells the server that the client wants to get the file /index.html and that this is an HTTP version 1.0 request. The server replies with a status line and the HTTP response header fields, followed by an empty line and then either the file itself or an error message if it cannot provide the requested file.
The telnet program, typically available on Windows and Linux, is a convenient tool for manually trying out textual, line-oriented TCP protocols such as HTTP. In the following activity, you will use the telnet program to access the web using HTTP.
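The exact text of such a request can be sketched in Java as follows. This is a minimal illustration only: the class name GetRequestSketch and the method buildGetRequest are hypothetical helpers and are not part of the tutorial's program. The sketch assumes that each line of the request ends with a carriage return and line feed and that an empty line marks the end of the request.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the tutorial's WebClient program.
class GetRequestSketch {
    static final String CRLF = "\r\n"; // HTTP line terminator

    // Builds the text of a minimal HTTP/1.0 GET request for the given path.
    static String buildGetRequest(String path) {
        return "GET " + path + " HTTP/1.0" + CRLF // request line: method, path, version
                + CRLF;                           // empty line: end of the request
    }

    public static void main(String[] args) {
        String request = buildGetRequest("/index.html");
        byte[] wire = request.getBytes(StandardCharsets.US_ASCII);
        // Print the control characters in escaped form so the line endings are visible.
        System.out.println(request.replace("\r", "\\r").replace("\n", "\\n"));
        System.out.println(wire.length + " bytes would be sent");
    }
}
```

Printing the request with the line endings escaped makes it easier to see that the request is just two lines of text, the second of which is empty.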

Activity 1

  • In a command shell (e.g., the Windows Command Prompt), run the telnet program to connect to the host <www.ouhk.edu.hk> at port 80:
      C:\> telnet www.ouhk.edu.hk 80
  • When the telnet program starts, the screen is cleared. Type in the following two lines:
      GET /index.html HTTP/1.0<CR>
      <CR>
  • In these two lines, <CR> represents the [Enter] key on the keyboard. That is, you should press the [Enter] key twice in a row. Observe the HTTP response from the web server.
  • The following shows a sample output of the telnet’s execution.
    HTTP/1.1 200 OK
    Date: Wed, 07 Oct 2009 09:02:44 GMT
    Server: Apache/1.3.26 (Unix) mod_jk/1.1.0 mod_ssl/2.8.9 OpenSSL/0.9.6b
    Last-Modified: Thu, 03 May 2007 02:07:01 GMT
    ETag: "420e2-1dd-463943c5"
    Accept-Ranges: bytes
    Content-Length: 477
    Connection: close
    Content-Type: text/html

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <HTML lang="zh">
    <HEAD>
    <title>The Open University of Hong Kong</title>
    <HEAD>
    <meta http-equiv="refresh" content="0;URL=http://www.ouhk.edu.hk/WCM/?FUELAP_TEM
    PLATENAME=tcSingPage&lang=eng">
    <META NAME="Author" CONTENT="The Open University of Hong Kong">
    <META NAME="Keywords" CONTENT="The Open University of Hong Kong">
    <META NAME="Description" CONTENT="This is a redirect page">
    </HEAD>
    . . . other content omitted . . .
Figure: HTTP GET transaction (Source: http://oreilly.com/openbook/webclient/ch03.html)
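The structure of the sample response above (a status line, then header fields, then an empty line before the content) can be picked apart with a few lines of Java. The following is a rough sketch under simplifying assumptions: the class ResponseHeadSketch and its methods are hypothetical names, header names are treated as case-sensitive, and repeated or folded header fields are not handled. The response text in main is abbreviated from the sample output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the tutorial's WebClient program.
class ResponseHeadSketch {
    // Extracts the numeric status code from a status line such as "HTTP/1.1 200 OK".
    static int statusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3); // version, code, reason phrase
        return Integer.parseInt(parts[1]);
    }

    // Collects the "Name: value" lines between the status line and the first empty line.
    static Map<String, String> parseHeaders(String response) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = response.split("\r\n");
        for (int i = 1; i < lines.length && !lines[i].isEmpty(); i++) {
            int colon = lines[i].indexOf(':'); // the first colon separates name and value
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                        lines[i].substring(colon + 1).trim());
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String sample = "HTTP/1.1 200 OK\r\n"
                + "Date: Wed, 07 Oct 2009 09:02:44 GMT\r\n"
                + "Content-Length: 477\r\n"
                + "Connection: close\r\n"
                + "Content-Type: text/html\r\n"
                + "\r\n"
                + "(content omitted)";
        System.out.println("Status code: " + statusCode(sample.split("\r\n")[0]));
        parseHeaders(sample).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Note that the Date value itself contains colons, which is why the sketch splits each header line at the first colon only.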

Activity 2

  • Now, let’s look at the source code of a Java program that connects to a web server using the Socket API.
import java.net.*;
import java.io.*;
import java.util.*;

public class WebClient {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        System.out.print("Enter host name (e.g., www.ouhk.edu.hk): ");
        String host = scanner.nextLine();
        System.out.print("Enter page (e.g., /index.html): ");
        String page = scanner.nextLine();
        final String CRLF = "\r\n"; // newline
        final int PORT = 80; // default port for HTTP

        try {
            Socket socket = new Socket(host, PORT);
            OutputStream os = socket.getOutputStream();
            InputStream is = socket.getInputStream();
            PrintWriter writer = new PrintWriter(os);
            writer.print("GET " + page + " HTTP/1.0" + CRLF);
            writer.print(CRLF);
            writer.flush(); // flush any buffer
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(is));
            String line;
            while ((line = reader.readLine()) != null)
                System.out.println(line);
            socket.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
  • Compile and execute the program to see if you can download a webpage from the Internet.
  • Study the program with the following explanation.
The program first asks the user for the host name of the web server and the web page to be retrieved, and stores them in the host and page variables, respectively. Two constants, CRLF and PORT, are then defined for the end-of-line and default port number of HTTP.
The socket accessing code is enclosed in a try-catch block, due to the possible occurrence of IOException on network errors. In the try-catch block, a socket is created with the entered host name and the default port of 80. The output stream and input stream of the socket are obtained by calling the getOutputStream and getInputStream methods on the socket, respectively.
As HTTP is a text-based protocol, we wrap the binary OutputStream and InputStream into character-based streams for writing and reading lines of text. More specifically, a PrintWriter is created to wrap the OutputStream and two strings are written by using its print method. The two strings are a GET request line and an empty line. After that, the flush method is invoked on the PrintWriter to ensure that the output data in the buffer (of the Java implementation and/or the operating system) are actually sent down to the network.
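The effect of buffering can be observed without a network connection. In the sketch below, a ByteArrayOutputStream stands in for the socket's OutputStream (an assumption made purely for illustration), and the hypothetical class FlushSketch counts how many bytes have reached it before and after the call to flush. On a typical JDK the PrintWriter holds a short request in its internal buffer, so nothing reaches the underlying stream until flush is called.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

// Hypothetical demonstration class; not part of the tutorial's WebClient program.
class FlushSketch {
    // Returns { bytes that reached the sink before flush, bytes after flush }.
    static int[] bytesBeforeAndAfterFlush(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for socket.getOutputStream()
        PrintWriter writer = new PrintWriter(sink); // same constructor as in WebClient: no auto-flush
        writer.print(text);
        int before = sink.size(); // data is usually still held in the writer's buffer
        writer.flush();
        int after = sink.size();  // the buffered data has now been pushed to the sink
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = bytesBeforeAndAfterFlush("GET /index.html HTTP/1.0\r\n\r\n");
        System.out.println("Bytes in sink before flush: " + counts[0]);
        System.out.println("Bytes in sink after flush:  " + counts[1]);
    }
}
```

With a real socket, a request that stays in the buffer is never seen by the server, so the client would wait for a response that the server has no reason to send.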
With the HTTP request sent to the server, the program uses the BufferedReader class to obtain the input from the InputStream of the socket. The lines of input are read using the readLine method of BufferedReader in a while loop and printed to the screen, until readLine returns null, which denotes the end of the input stream. At the end, we invoke the close method on the socket to free any resources used by it.
After studying the above example, you may realize how simple it is to build a TCP client using the Java socket and stream APIs. As we have mentioned before, the operations of a client program basically involve four steps. Now do the following activity to examine the APIs and see whether you can recognize these steps in the program source.
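The read loop described above can be tried in isolation. The following sketch is one possible variation, not the tutorial's own code: the class ReadLoopSketch and the method readAllLines are hypothetical, a ByteArrayInputStream with a made-up canned response stands in for the socket's InputStream, the ISO-8859-1 charset is an assumption (WebClient uses the platform default), and the reader is closed with a try-with-resources statement instead of an explicit call to close.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the tutorial's WebClient program.
class ReadLoopSketch {
    // Reads lines until readLine returns null, that is, until the end of the stream.
    static List<String> readAllLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader (and the wrapped stream) even if reading fails.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.ISO_8859_1))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex); // rethrown unchecked to keep the sketch short
        }
        return lines;
    }

    public static void main(String[] args) {
        // Canned bytes standing in for what a server might send back.
        byte[] canned = "HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n<HTML>"
                .getBytes(StandardCharsets.ISO_8859_1);
        readAllLines(new ByteArrayInputStream(canned)).forEach(System.out::println);
    }
}
```

With a real socket, the end of the stream is reached when the server closes the connection, which it normally does after answering an HTTP/1.0 request like the one sent by WebClient.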

Activity 3

  • Study the Web client program and the Java API documentation to answer the following questions.
Question 1: In the source code of the WebClient program, identify the four essential steps in establishing a TCP client.
Question 2: Why do we need to flush the output stream of the socket (via the wrapping PrintWriter) after writing the HTTP request? You may comment out or remove the line writer.flush(); to observe the program’s behaviour when no flushing is performed.
Question 3: What is the use of invoking the close method on the socket at the end of the program?
Question 4: The program obtains the input and output streams from the socket. It closes the socket explicitly but not the two streams. Is it necessary to close the streams explicitly? You may refer to the API documentation of the Socket class.

Resources for you to learn more


Submission

  • Please submit your work to Steven by email.
Page last modified on October 08, 2009, at 10:43 AM