Written by Kevin Schroeder
Editor's note: This article is an excerpt from Chapter 8, "Web Service Basics," of Advanced Guide to PHP on IBM i.
By definition, a web service is a remote procedure call that is made over the web. The web is run using HTTP. So for a remote procedure call to be called a web service, it needs to be called over HTTP or some derivative thereof. Although several protocols are used to pass information back and forth between machines, if this exchange is done over HTTP, then it is technically a web service.
No web service standard exists that we need to follow for something to qualify as being a web service. This is both a benefit and a drawback. It is a benefit because getting a web service up and running can be quick and easy to do. The drawback is that the lack of a standard and real constraints makes the world of web services incredibly fragmented. What is "cool" is often what wins out, and it is difficult to find a pattern of best practices because everyone is doing things differently.
So, as you go through this chapter, know that you will probably start out creating web services in a manner that is both wrong and right. But understand that it matters less than you think unless you are using a standard such as Simple Object Access Protocol (SOAP), which is very defined, rigid, and verbose.
REST
We will start our examination with Representational State Transfer, or REST. There is lots of conversation about what is and is not REST, most of which is largely futile. On a basic level, REST means do X for Y via resource Z. But where the confusion sometimes lies is in how X is done and how Y is represented and what Z looks like. And the confusion is made worse by those who assert that certain REST services that claim to be RESTful are not truly RESTful and give a litany of reasons that rival the 95 Theses, when in reality they just need to take themselves a little less seriously. Most services are not truly RESTful, but it does not really matter.
In actuality, most services are REST-like and not RESTful. This is probably because people were using REST-like systems well before REST came along, but REST provided an opportunity for a communication type to coalesce around a standard. As is true with almost every standard, however, nobody follows it perfectly. Perhaps the controversy stems from the fact that nobody wants his or her PhD dissertation to be misrepresented, but reading the dissertation itself does not render a clearly defined standard. Comparing the debate about REST to a "strongly" standards-based approach such as a Request for Comment (RFC) document or W3C standard will demonstrate why there is such confusion around REST. REST is not a highly structured protocol but a series of practices or, rather, an architectural style.
As you read through this chapter, know that it would probably throw the REST author for a loop. However, my concern in writing this is to explain the benefits of a REST-like approach as opposed to ensuring RESTful compliance. As with many things, the value is in knowing what to use, what to ignore, and how to balance both to best fit a given circumstance. As such, I will be highlighting important points to note. If you intend to build a fully RESTful interface, several online and print resources are available that expound upon the subject in fuller detail.
Basics
The definition of REST might seem somewhat restrictive because REST is intended to solve web-based problems in a web-based style. For an API that serves a few hundred requests daily, many of these limitations might seem overkill, but the design of REST is intended to allow for several processes to occur.
A REST-based architectural style has several constraints, but we will focus on just a few of them. The first is that the system is intended to be scalable. A REST-like system should be stateless. Stateless protocols are much easier to scale than those that are required to maintain state. As a web developer, you will undoubtedly be familiar with the world's most popular stateless protocol: HTTP. Because HTTP is stateless, each request is isolated from other requests and has no knowledge of what happened before, making responses easily cacheable. This is a significant component of designing a REST-like architecture.
Caching can be done either on the client or on an intermediary in between the client and the server. As such, an API can use a layered system with many different caches in between, and the client will be none the wiser. Generally, caching is managed via HTTP caching headers to inform any intermediary servers (or the client) what the caching parameters are. However, because each URL is intended to be a location to a representation of a resource, the content can (and should) be cached on the server side as well.
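To make the caching headers a little more concrete, here is a minimal sketch of how a PHP endpoint might describe its caching parameters. The function names cacheHeadersFor() and isNotModified() are hypothetical, and using an MD5 hash of the body as the ETag is just one simple assumption among many possible validators.

```php
<?php
// Hypothetical helper: builds caching headers for a resource representation.
function cacheHeadersFor(string $body, int $maxAgeSeconds): array
{
    return array(
        // Tells the client and any intermediary how long the response may be reused.
        'Cache-Control' => 'public, max-age=' . $maxAgeSeconds,
        // A validator derived from the content; it changes whenever the body changes.
        'ETag' => '"' . md5($body) . '"',
    );
}

// Returns true when the validator sent by the client (the If-None-Match
// request header) still matches, meaning a 304 Not Modified would suffice.
function isNotModified(?string $ifNoneMatch, string $etag): bool
{
    return $ifNoneMatch !== null && trim($ifNoneMatch) === $etag;
}

$body = json_encode(array('name' => 'Kevin'));
$headers = cacheHeadersFor($body, 300);
// In the CLI there is no If-None-Match header, so this resolves to 200.
$status = isNotModified($_SERVER['HTTP_IF_NONE_MATCH'] ?? null, $headers['ETag']) ? 304 : 200;
```

In a real endpoint you would pass each header to header() and set the status with http_response_code() before sending the body.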
Resource Definitions
Much of REST's magic is managed in the way it handles resource endpoints. The tactic was not new or novel, but it unified the approach being used. A REST-based scenario has two basic types of URLs: collection-based and entity-based.
When building a typical API, developers often build out the API by using endpoints with verbs; for example, http://localhost/api/users/getUsers (we will be doing something like this later). But this is not the REST way. REST is intended to be representational. In other words, it is supposed to represent the data in your organization, not describe what that data is doing.
So for the user example, rather than having a verb in your API call, you have a resource location. In this case, it is http://localhost/api/users. When that URL is queried, the client will receive a list of user resources, each of which will include a URL.
Here is where you might get the purists' panties in a bunch. How do you return the results of the query for the user collection? You might think that returning a list of defined users is what you should do. And you would be wrong. Instead, for your API to be properly RESTful, you should return a list of user endpoint URLs. You would then query those individual URLs to retrieve the user data for each. That is what you need for the API to be RESTful.
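As a rough sketch of what that collection response could look like, the following builds a list of entity URLs instead of embedding the user data. The helper name userCollectionUrls() and the JSON output format are assumptions made for illustration.

```php
<?php
// Hypothetical helper: represents a user collection as a list of entity URLs
// rather than embedding each user's data in the response.
function userCollectionUrls(array $userIds, string $collectionUrl): array
{
    $urls = array();
    foreach ($userIds as $id) {
        // Each entity lives directly beneath the collection URL.
        $urls[] = rtrim($collectionUrl, '/') . '/' . rawurlencode((string) $id);
    }
    return $urls;
}

// The identifiers would normally come from your data store.
$json = json_encode(
    userCollectionUrls(array(1, 2, 3), 'http://localhost/api/users'),
    JSON_UNESCAPED_SLASHES
);
```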
To retrieve an individual entity, you put its unique identifier at the end of the URL: http://localhost/api/users/1. Each entity will have a URL where its data payload can be downloaded from, and only from there.
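On the server side, telling the two URL types apart can be as simple as checking for a trailing identifier. This is only a sketch: describeResource() is a hypothetical name, and the pattern assumes every resource sits under the /api/ prefix used in these examples.

```php
<?php
// Hypothetical helper: classifies a request path as a collection or an entity.
// Assumes all resources live under the /api/ prefix used in the examples.
function describeResource(string $path): ?array
{
    if (!preg_match('#^/api/([A-Za-z]+)(?:/([^/]+))?/?$#', $path, $m)) {
        return null; // Not a resource path this sketch understands.
    }
    if (isset($m[2]) && $m[2] !== '') {
        // A trailing identifier addresses one entity in the collection.
        return array('type' => 'entity', 'collection' => $m[1], 'id' => $m[2]);
    }
    return array('type' => 'collection', 'collection' => $m[1]);
}

$resource = describeResource('/api/users/1');
```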
But isn't that kind of wasteful? "I could be making hundreds, or thousands, of queries against an API to get each of the entities I need," you might say. And you would be right. This is exactly why many people use the REST characteristics without being truly RESTful. Being truly RESTful can be a giant pain in the butt. So take some shortcuts, if you must.
However, by taking shortcuts, you will lose some of the benefits, namely caching. If you retrieve a user collection that returns entities, can you cache the results? Well, yes. But what happens when someone tries to retrieve the entity via its unique URL? And then what happens when that entity has changed? Now you need to know all the places where that entity has been cached and invalidate them all. Doing so is somewhat easy if you are caching internally in PHP. But what happens if you are using Varnish or some kind of multitier caching system? Then it becomes much more problematic.
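To see why the internal PHP case is the easier one, consider this sketch of a cache that remembers which cached responses contain which entity. The TaggedCache class is hypothetical and lives in memory only. With Varnish or another tier in front, you would also have to send purge requests to each layer, which is where the real difficulty begins.

```php
<?php
// Hypothetical in-memory cache that remembers which cached responses
// embed which entity, so all of them can be dropped together.
class TaggedCache
{
    private $entries = array();   // cache key => cached payload
    private $keysByTag = array(); // entity tag => set of cache keys

    public function set(string $key, string $payload, array $entityTags): void
    {
        $this->entries[$key] = $payload;
        foreach ($entityTags as $tag) {
            $this->keysByTag[$tag][$key] = true;
        }
    }

    public function get(string $key): ?string
    {
        return $this->entries[$key] ?? null;
    }

    // Drops every cached response that contains the changed entity.
    public function invalidate(string $entityTag): void
    {
        foreach (array_keys($this->keysByTag[$entityTag] ?? array()) as $key) {
            unset($this->entries[$key]);
        }
        unset($this->keysByTag[$entityTag]);
    }
}

$cache = new TaggedCache();
// The collection response embeds both users; the entity response only one.
$cache->set('/api/users', '[{"id":1},{"id":2}]', array('user:1', 'user:2'));
$cache->set('/api/users/1', '{"id":1}', array('user:1'));
$cache->invalidate('user:1'); // user 1 changed: both entries are now stale
```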
Usage of HTTP Verbs
Did you know that HTTP has more method actions than just GET and POST? RFC 2616 defines eight (GET, POST, HEAD, OPTIONS, CONNECT, PUT, DELETE, and TRACE), and RFC 5789 adds another one (PATCH) that REST can exploit. A REST-like API does not use verbs in URLs, so you determine the type of action to do via the HTTP method.
Wikipedia has a great chart (Table 8.1) that helps put this into perspective. (You can find the chart, along with other information about REST, at http://en.wikipedia.org/wiki/representational_state_transfer.)
Table 8.1: Relationship of REST endpoint types and HTTP methods
| Resource | GET | PUT | POST | DELETE |
| --- | --- | --- | --- | --- |
| Collection URI | List the URIs and perhaps other details of the collection's members. | Replace the entire collection with another collection. | Create a new entry in the collection. The new entry's URI is assigned automatically and is usually returned by the operation. | Delete the entire collection. |
| Element URI | Retrieve a representation of the addressed member of the collection, expressed in an appropriate Internet media type. | Replace the addressed member of the collection, or if it does not exist, create it. | Not generally used. Treat the addressed member as a collection in its own right and create a new entry in it. | Delete the addressed member of the collection. |
So to delete an individual user, you would use an HTTP request like this:
DELETE /api/users/1 HTTP/1.0
To update the same user, use this:
PUT /api/users/1 HTTP/1.0
Content-Type: application/xml
Content-Length: 68
<user>
<name>Kevin</name>
<email>
</user>
Or to delete all of the users, code the following:
DELETE /api/users HTTP/1.0
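On the receiving end, the combination of HTTP method and URL type tells you what to do. The following sketch maps the two onto the actions from Table 8.1. The function name resolveAction(), the action labels, and the choice to reject POST against an element are all assumptions for illustration, and the sketch only knows about the users collection.

```php
<?php
// Hypothetical dispatcher: maps an HTTP method and path to an action
// label based on Table 8.1. Only the /api/users collection is known.
function resolveAction(string $method, string $path): string
{
    if (!preg_match('#^/api/users(?:/([^/]+))?$#', $path, $m)) {
        return 'notFound'; // would map to HTTP 404
    }
    // A trailing identifier means the path addresses a single element.
    $type = isset($m[1]) ? 'element' : 'collection';
    $actions = array(
        'collection' => array(
            'GET' => 'list',
            'PUT' => 'replaceAll',
            'POST' => 'create',
            'DELETE' => 'deleteAll',
        ),
        'element' => array(
            'GET' => 'retrieve',
            'PUT' => 'replaceOrCreate',
            'DELETE' => 'delete',
        ),
    );
    // Anything not in the map is rejected; this would map to HTTP 405.
    return $actions[$type][strtoupper($method)] ?? 'methodNotAllowed';
}

$action = resolveAction('DELETE', '/api/users/1');
```

In a web request you would typically feed it $_SERVER['REQUEST_METHOD'] and the path portion of $_SERVER['REQUEST_URI'].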
Considering that Roy Fielding co-authored the HTTP protocol, it makes perfect sense that he would use it when defining REST. REST does not specifically require HTTP, but it is a protocol that fits REST's needs for statelessness and easy caching better than most others. As such, the HTTP methods correspond nicely to the verbs that REST requires.
Authentication
REST does not define a method of authentication, so you are free to use whichever one you like. Given that the REST definition inherently includes the ability to delete an entire collection, you should implement some level of authentication and access control as part of your API. But because you do that separately from REST, though still on top of HTTP, we will cover it later in the chapter as part of a more general discussion on authentication for APIs.
SOAP
Ah, SOAP. The only protocol that makes you feel dirty after using it. SOAP is one of the most bloated, complicated protocols you will ever see, but it's one of the most sophisticated. For this reason, many developers have rebelled against SOAP and opted for a REST-like approach. Unlike REST, which is an architectural style as opposed to a protocol, SOAP is a protocol that has stringent requirements and expectations from the clients that it interacts with. SOAP's stringent data requirements are also a benefit. Because SOAP is a complex protocol, we will take only a basic look at both the protocol and the PHP implementation.
You implement SOAP by exchanging XML documents: the client sends one as the request, and the server generates another and returns it as the response. Unlike with REST or many other web service implementations, the structure of the document is extremely important. Additionally, unlike REST, in SOAP the URL is largely unimportant because the remote procedure call is defined in the XML document itself.
The base XML document is based on the SOAP schema and must use SOAP namespaces and encoding in the document. You can find the XML Schema Definition (XSD) for the basic SOAP request structure at http://www.w3.org/2001/12/soap-envelope and the encoding at http://www.w3.org/TR/2000/NOTE-SOAP-20000508/encoding-2000-04-18.xml. The root element of the document is soap:Envelope, which can contain two child elements: an optional soap:Header and a required soap:Body.
The header does not play as prominent a role as the body does. The header is optional, whereas the body is required, and the header does not have the same level of structural requirements as the body. However, you can define the header requirements in a Web Services Definition Language file, or WSDL (which we will examine later), and make it a prerequisite for a properly constructed request.
But for now, we will look at WSDL-less requests for simplicity:
$client = new SoapClient(
    null,
    array(
        'location' => 'http://localhost/soap/server.php',
        'uri' => 'http://test-uri/'
    )
);
echo $client->getServerDate();
This code is setting up a WSDL-less SOAP request. The first parameter in the SoapClient constructor is the URL of the WSDL to use. Because this example does not have a WSDL, you set that value as null. But by doing that, you must provide the endpoint for the SoapClient to call, which you do in an array of options in the second parameter. That value is specified in the location key. But what about that uri key? You use that key to specify the XML namespace for the actual SOAP call XML structure.
The method that you call, in this case getServerDate(), does not exist in the SoapClient class. It is handled via __call(), which lets you act upon the SoapClient object as if it were the remote object itself.
When you execute the client code, it sends the following XML document to the server (reformatted for clarity):
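If you have not used __call() before, this small sketch shows the mechanism on its own. RecordingProxy is a hypothetical class written for illustration and says nothing about how SoapClient is implemented internally. It only demonstrates that PHP routes calls to undefined methods through __call().

```php
<?php
// Hypothetical stand-in for a remote client; this is not SoapClient's code.
class RecordingProxy
{
    public $calls = array();

    // PHP invokes __call() whenever an undefined or inaccessible
    // method is called on the object.
    public function __call($name, $arguments)
    {
        // A real client would serialize the call and send it to the server here.
        $this->calls[] = array('method' => $name, 'arguments' => $arguments);
        return 'called ' . $name . ' with ' . count($arguments) . ' argument(s)';
    }
}

$proxy = new RecordingProxy();
$result = $proxy->getServerDate(); // no such method exists, so __call() runs
```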
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ns1="http://test-uri/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <ns1:getServerDate/>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
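Because the procedure call travels in the body rather than in the URL, you can read it straight out of a document like this one. Here is a minimal sketch, assuming the DOM extension is available. The helper soapOperation() is a hypothetical name, and the sketch handles only the SOAP 1.1 envelope namespace seen in this request.

```php
<?php
// Hypothetical helper: extracts the operation name and its namespace
// from the body of a SOAP 1.1 request document.
function soapOperation(string $xml): ?array
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return null; // Not well-formed XML.
    }
    $envelopeNs = 'http://schemas.xmlsoap.org/soap/envelope/';
    $body = $doc->getElementsByTagNameNS($envelopeNs, 'Body')->item(0);
    if ($body === null) {
        return null; // No SOAP body present.
    }
    // The first element child of the body names the procedure being called;
    // whitespace text nodes between elements are skipped.
    foreach ($body->childNodes as $child) {
        if ($child instanceof DOMElement) {
            return array(
                'name' => $child->localName,
                'namespace' => $child->namespaceURI,
            );
        }
    }
    return null;
}
```

For this request, the name is getServerDate and the namespace is the http://test-uri/ value that the client supplied in its uri option.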
To handle the request, you create an endpoint that instantiates the SoapServer class. You then specify the response URI as well as a list of functions that will be responsible for handling any requests that are made.
Following is a basic SOAP server implementation that will handle the request you previously defined:
function getServerDate()
{
    return date('r');
}
$server = new SoapServer(
    null,
    array(
        'uri' => 'http://test-uri/'
    )
);
$server->addFunction('getServerDate');
$server->handle();
As with SoapClient, the first parameter is the WSDL. However, you are currently running in WSDL-less mode, so you pass null and define the uri as one of the options. The URI provides the namespace for the body of the response. When using functions instead of classes, you must individually register each function that will be used to respond to the client.
The server generates a response similar to this (again, formatted for clarity):
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:ns1="http://test-uri/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <ns1:getServerDateResponse>
      <return xsi:type="xsd:string">Wed, 10 Jul 2013 07:24:45 -0500</return>
    </ns1:getServerDateResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Note that the response is defined in a *Response node under the body node. This is largely uninteresting, however, because it happens behind the scenes. But if you need to see what is occurring, you can use a little-used setting on the client:
$client = new SoapClient(
    null,
    array(
        'location' => 'http://localhost/soap/server.php',
        'uri' => 'http://test-uri/',
        'trace' => 1
    )
);
This setting lets you use another method called __getLastResponse(), which returns the XML document that was received from the server. This is useful for debugging.
echo htmlspecialchars($client->__getLastResponse());
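Another way to inspect the exchange while debugging is to skip HTTP entirely and hand a request document directly to the server object. This is only a sketch and assumes the SOAP extension is loaded. The handleInProcess() helper is a hypothetical name, and it relies on SoapServer::handle() accepting the request XML as an optional argument and writing the response to the output buffer.

```php
<?php
function getServerDate()
{
    return date('r');
}

// Hypothetical helper: runs a SOAP request through SoapServer without HTTP.
// Returns null when ext-soap is missing so the sketch fails softly.
function handleInProcess(string $requestXml): ?string
{
    if (!extension_loaded('soap')) {
        return null;
    }
    $server = new SoapServer(null, array('uri' => 'http://test-uri/'));
    $server->addFunction('getServerDate');
    ob_start();                   // handle() writes the response to output
    $server->handle($requestXml); // the explicit argument replaces the HTTP body
    return ob_get_clean();
}

$request = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<SOAP-ENV:Envelope'
    . ' xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"'
    . ' xmlns:ns1="http://test-uri/">'
    . '<SOAP-ENV:Body><ns1:getServerDate/></SOAP-ENV:Body>'
    . '</SOAP-ENV:Envelope>';
$response = handleInProcess($request);
```

Keep in mind that if the request triggers a SOAP fault, handle() ends the script, so this approach suits quick experiments better than production code.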
Although functions are easier to implement for smaller web services, they also become more complicated as the API requirements grow. You can add a simple class by defining the class and then using the setClass() method on the SoapServer instance:
class SoapEndpoint {
    public function getServerDate()
    {
        return date('r');
    }
}
$server = new SoapServer(
    null,
    array(
        'uri' => 'http://test-uri/'
    )
);
$server->setClass('SoapEndpoint');
$server->handle();
If you were to run your client code against this server, the code would operate exactly as it previously did. You could continue down this track, but it would result in increasing complexity to the point of nonsense. You can do much to configure the SOAP objects programmatically, but one of the benefits of SOAP is that it is supposed to document and manage a lot of this on its own.