C++ Web Programming

What is CGI?

  • The Common Gateway Interface (CGI) is a set of standards that defines how information is exchanged between a web server and a custom script.
  • The CGI specification was originally maintained by NCSA, which defines CGI as follows:
  • The Common Gateway Interface (CGI) is a standard for interfacing external gateway programs with information servers such as HTTP servers.
  • The current version is CGI/1.1, which is documented in RFC 3875.

Web browsing

To better understand the concept of CGI, let's see what happens when you click a hyperlink to browse to a particular web page or URL.

  • Your browser contacts the HTTP web server and requests the URL, which is the file name.
  • The web server will parse the URL and look up the file name. If the requested file is found, the web server will send the file back to the browser, otherwise an error message will be sent indicating that you requested an incorrect file.
  • The web browser gets the response from the web server and displays the file or error message based on the response received.

However, an HTTP server can also be set up so that whenever a file in a certain directory is requested, the file is not sent back as it is. Instead, the server executes it as a program and sends the output of that program back to the browser, which displays it.

The Common Gateway Interface (CGI) is a standard protocol that enables applications (called CGI programs or CGI scripts) to interact with web servers and clients. These CGI programs can be written in Python, Perl, shell, C, or C++.

CGI architecture diagram

[Figure: architecture of CGI]

Web Server Configuration

Before you do CGI programming, make sure your web server supports CGI and is configured to handle CGI programs. All CGI programs executed by the HTTP server must be in a pre-configured directory. This directory is called the CGI directory and is named by convention as /var/www/cgi-bin. Although a CGI file is a C++ executable, its extension is .cgi by convention.

By default, the Apache web server is configured to run CGI programs in /var/www/cgi-bin. If you want to specify a different directory to run CGI scripts, you can modify the following sections in the httpd.conf file:

<Directory "/var/www/cgi-bin">
   AllowOverride None
   Options ExecCGI
   Order allow,deny
   Allow from all
</Directory>

<Directory "/var/www/cgi-bin">
   Options All
</Directory>

Here, we assume that the web server is configured and running successfully, and that it can already run other CGI programs, such as Perl or shell scripts.

First CGI program

See the C++ program below:


#include <iostream>
using namespace std;

int main () {
   cout << "Content-type:text/html\r\n\r\n";
   cout << "<html>\n";
   cout << "<head>\n";
   cout << "<title>Hello World - First CGI program </title>\n";
   cout << "</head>\n";
   cout << "<body>\n";
   cout << "<h2>Hello World! This is my first CGI program</h2>\n";
   cout << "</body>\n";
   cout << "</html>\n";

   return 0;
}

Compile the above code, name the executable file cplusplus.cgi, and save the file in the /var/www/cgi-bin directory. Before running the CGI program, use the chmod 755 cplusplus.cgi UNIX command to modify the file mode to ensure the file is executable. Access the executable and you will see the following output:

Hello World! This is my first CGI program

The above C++ program is a simple program that writes its output to standard output (STDOUT), which the web server relays to the browser. Note that the first line written is Content-type:text/html\r\n\r\n. This line is sent back to the browser and specifies the type of content to be displayed in the browser window. You need to understand these basic CGI concepts before you can write more complex CGI programs in C++. A C++ CGI program can interact with any other external system, such as an RDBMS.

HTTP header information

The line Content-type:text/html\r\n\r\n is part of the HTTP header, which is sent to the browser so that it can interpret the page content. An HTTP header has the following form:

HTTP Field Name: Field Content
For example:
Content-type: text/html\r\n\r\n

There are a few other important HTTP headers that you will often use in CGI programming.

Header               Description
Content-type:        MIME string that defines the format of the file being returned. For example, Content-type: text/html.
Expires: Date        The date on which the information becomes invalid. The browser uses it to decide when a page needs to be refreshed. A valid date string has the format 01 Jan 1998 12:00:00 GMT.
Location: URL        The URL that should be returned instead of the requested URL. You can use this field to redirect a request to any file.
Last-modified: Date  The date the resource was last modified.
Content-length: N    The length, in bytes, of the data being returned. The browser uses this value to report the estimated download time of a file.
Set-Cookie: String   Sets the cookie passed through the string.

CGI environment variables

All CGI programs have access to the following environment variables. These variables play a very important role in writing CGI programs.

Variable name    Description
CONTENT_TYPE     The data type of the content. Used when the client sends attached content to the server, for example when uploading a file.
CONTENT_LENGTH   The length of the query information. Available only for POST requests.
HTTP_COOKIE      Returns the set cookies in the form of key and value pairs.
HTTP_USER_AGENT  The User-Agent request header field, which carries information about the user agent that made the request, such as the browser's name and version and other platform-specific details.
PATH_INFO        The extra path information that follows the CGI script's name in the URL.
QUERY_STRING     The URL-encoded information sent with a GET request, that is, the parameters that follow the question mark in the URL.
REMOTE_ADDR      The IP address of the remote host making the request. This is useful for logging and for authentication.
REMOTE_HOST      The fully qualified name of the host making the request. If this information is not available, you can use REMOTE_ADDR to get the IP address.
REQUEST_METHOD   The method used to make the request. The most common methods are GET and POST.
SCRIPT_FILENAME  The full path to the CGI script.
SCRIPT_NAME      The name of the CGI script.
SERVER_NAME      The host name or IP address of the server.
SERVER_SOFTWARE  The name and version of the software running on the server.

The following CGI program lists all CGI variables.


#include <iostream>
#include <cstdlib>
#include <string>
using namespace std;

// Names of the environment variables to display
const string ENV[ 24 ] = {
   "COMSPEC", "DOCUMENT_ROOT", "GATEWAY_INTERFACE",
   "HTTP_ACCEPT", "HTTP_ACCEPT_ENCODING",
   "HTTP_ACCEPT_LANGUAGE", "HTTP_CONNECTION",
   "HTTP_HOST", "HTTP_USER_AGENT", "PATH",
   "QUERY_STRING", "REMOTE_ADDR", "REMOTE_PORT",
   "REQUEST_METHOD", "REQUEST_URI", "SCRIPT_FILENAME",
   "SCRIPT_NAME", "SERVER_ADDR", "SERVER_ADMIN",
   "SERVER_NAME", "SERVER_PORT", "SERVER_PROTOCOL",
   "SERVER_SIGNATURE", "SERVER_SOFTWARE" };

int main () {
   cout << "Content-type:text/html\r\n\r\n";
   cout << "<html>\n";
   cout << "<head>\n";
   cout << "<title>CGI Environment Variables</title>\n";
   cout << "</head>\n";
   cout << "<body>\n";
   cout << "<table border = \"0\" cellspacing = \"2\">";

   for ( int i = 0; i < 24; i++ ) {
      cout << "<tr><td>" << ENV[ i ] << "</td><td>";

      // Try to retrieve the value of the environment variable
      char *value = getenv( ENV[ i ].c_str() );
      if ( value != nullptr ) {
         cout << value;
      } else {
         cout << "The environment variable does not exist.";
      }
      cout << "</td></tr>\n";
   }

   cout << "</table>\n";
   cout << "</body>\n";
   cout << "</html>\n";

   return 0;
}