Lassi Haaranen, edited by Petri Ihantola
licensed under a Creative Commons Attribution 4.0 International license.
Not all web applications are meant directly for people. Many web services provide useful information aimed primarily at other applications. Examples include authentication, stock data, weather reports, and Twitter feeds.
These programmatic interfaces might not have a GUI; instead, they are meant to be used by other programs, on the client side as well as on the server side.
Typically they are created in a way that allows the consumers to use different operating systems or languages.
Most commonly they transfer the data through HTTP.
Three commonly used options for web services:
SOAP
Simple Object Access Protocol
XML-RPC
XML (encoded) Remote Procedure Call
REST
Representational State Transfer (probably the most common nowadays)
Simple Object Access Protocol (SOAP) is an XML-based messaging protocol in which the client sends requests encoded in XML and gets responses back as XML. It can use other protocols in addition to HTTP, such as SMTP.
Web Services Description Language (WSDL) is an XML-based language used to describe the data and functionality offered by a web service. It's commonly used with SOAP.
You should not need to write SOAP yourself; let the API or toolkit handle that for you.
A call in some system like this:
...getFirstName("tteekkar")
Would be expressed in SOAP like this:
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <mtd:GetFirstName xmlns:mtd="an URI">
      <username xsi:type="xsd:string">tteekkar</username>
    </mtd:GetFirstName>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Remote Procedure Call (RPC) generally refers to a collection of techniques for calling a procedure, typically on another computer. XML-RPC is a specification that preceded SOAP.
Procedure calls, parameters and responses are encoded in XML. It's simpler than SOAP and doesn't have a WSDL.
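To make the encoding concrete, here is a minimal sketch using the xmlrpc.client module from Python's standard library. It only encodes and decodes a request locally, without contacting any server. The method name getFirstName and the argument are borrowed from the SOAP example above; whether any real service exposes such a method is an assumption.

```python
import xmlrpc.client

# Encode the call getFirstName("tteekkar") as an XML-RPC request body.
request_xml = xmlrpc.client.dumps(("tteekkar",), methodname="getFirstName")
print(request_xml)

# Decoding the same document recovers the parameters and the method name.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

The printed XML is noticeably shorter than the SOAP envelope: there are no namespaces and no envelope/body wrapping, only a methodCall element with its parameters.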
In RPC the focus is on methods. In Representational State Transfer (REST) the focus is on resources that can be accessed and acted upon in a standard way.
Roy Fielding described REST in his dissertation 'Architectural Styles and the Design of Network-based Software Architectures'.
Anything that is important enough to be separately referenced, modified, or retrieved is a resource that has a reference. A Uniform Resource Identifier (URI) is that reference. URIs should be constructed in an intuitive (human-readable) manner and should reflect the underlying data structure.
The web is full of different APIs; a good site to start looking for what is out there is programmableweb.com.
As an example, see the New York Times API for the most shared sports items from the past 7 days.
Resources are any type of data that can be pointed to with a hyperlink: images, documents, web services, etc.
Data is encoded in a way that it can represent the state of the resource and resources are referenced by URIs.
REST is a pattern, a way of doing things that is not formally codified.
The REST portion of this lecture is influenced by Beautiful REST & JSON APIs by Les Hazlewood (definitely worth watching if you need to create an API).
Scalability, Generality, Independence, Latency (caching), Security, Encapsulation, Separation of concerns, Statelessness, Uniform interface
REST APIs often use JSON since it offers many benefits:
However, XML, plain text, HTML, etc. are also used.
In RPC, resources can be thought of as verbs. In REST, resources should be nouns.
Broadly speaking there are two types of resources:
Collection resource
e.g. /blogposts/2014
These link to and list multiple instance resources. They might also have their own attributes (e.g. 'first', 'latest').
Instance resource
e.g. /blogposts/rant-about-rest
These resources describe one instance of something.
Example of an instance resource:
// from: http://some-library-service/isbn/978-0596517748
{
  "title": "JavaScript: The Good Parts",
  "author": "Douglas Crockford",
  "published": "2008",
  "isbn": "978-0596517748",
  "href": "http://some-library-service/isbn/978-0596517748"
}
Example of a collection resource:
// from: http://some-library-service/keyword/javascript
{
  "keywords": ["javascript"],
  "books": [
    {
      "href": "http://some-library-service/isbn/978-0596517748",
      "isbn": "978-0596517748"
    }, {
      "href": "http://some-library-service/isbn/978-1933988696",
      "isbn": "978-1933988696"
    },
    ...
  ]
}
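As a rough illustration of how a server might produce these two kinds of representations, the following sketch builds them from an in-memory dictionary. The BOOKS dictionary and both function names are hypothetical; only the host name, ISBNs, and field names come from the examples above.

```python
import json

BASE = "http://some-library-service"  # host name from the examples above

# Hypothetical in-memory data standing in for a real database.
BOOKS = {
    "978-0596517748": {
        "title": "JavaScript: The Good Parts",
        "author": "Douglas Crockford",
        "published": "2008",
        "keywords": ["javascript"],
    },
    "978-1933988696": {
        "title": "Secrets of the JavaScript Ninja",
        "keywords": ["javascript"],
    },
}


def instance_resource(isbn):
    """Representation of one book, including a link to itself."""
    fields = {k: v for k, v in BOOKS[isbn].items() if k != "keywords"}
    return {**fields, "isbn": isbn, "href": f"{BASE}/isbn/{isbn}"}


def collection_resource(keyword):
    """Representation of a collection: links to the matching instances."""
    books = [
        {"href": f"{BASE}/isbn/{isbn}", "isbn": isbn}
        for isbn, book in BOOKS.items()
        if keyword in book["keywords"]
    ]
    return {"keywords": [keyword], "books": books}


print(json.dumps(collection_resource("javascript"), indent=2))
```

Note that the collection only carries links and identifiers; the client follows an href to get the full instance.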
Any individual request should be understandable without knowledge of any preceding requests.
This also means that requests can be interpreted and acted upon individually, which makes parallel processing easier.
All required data needs to be present in the request itself.
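A small sketch of what this buys you: if a handler derives its answer only from the request it is given, the same requests can be served one after another or in parallel with identical results. The handler, the POSTS data, and the request dictionaries below are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data set served by the API.
POSTS = [f"post-{n}" for n in range(10)]


def handle(request):
    """Answer a request using only what the request itself contains."""
    offset, limit = request["offset"], request["limit"]
    return POSTS[offset:offset + limit]


incoming = [
    {"offset": 0, "limit": 3},
    {"offset": 3, "limit": 3},
    {"offset": 6, "limit": 4},
]

# Served one at a time...
sequential = [handle(r) for r in incoming]

# ...or by several workers at once; no request depends on an earlier one.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(handle, incoming))

print(sequential == parallel)
```

A stateful design (for example, a server-side "current page" remembered between calls) would not survive this kind of reordering.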
Addressability (every resource has a unique identifier) has the major benefit that resources can be used in ways their creators did not foresee.
Related to this it is a good idea to version your APIs, for example:
http://my-cool-site.com/api/v1/some/stuff
The future is unknown, and you might need to change the API for one reason or another. Versioning enables you to leave the old behavior in place so those programs that depend on that version will not be broken.
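One way to picture this, as a sketch only: route on the version segment of the path so that old and new handlers can coexist. The handler functions, the ROUTES table, and the v2 response shape are invented for illustration.

```python
def get_stuff_v1():
    return {"name": "stuff"}


def get_stuff_v2():
    # A later, incompatible response shape; v1 clients never see it.
    return {"data": {"name": "stuff"}, "version": 2}


# (version, resource path) -> handler
ROUTES = {
    ("v1", "some/stuff"): get_stuff_v1,
    ("v2", "some/stuff"): get_stuff_v2,
}


def dispatch(path):
    """Route '/api/<version>/<resource path>' to the matching handler."""
    parts = path.strip("/").split("/", 2)
    if len(parts) != 3 or parts[0] != "api":
        return 404, None
    handler = ROUTES.get((parts[1], parts[2]))
    if handler is None:
        return 404, None
    return 200, handler()


print(dispatch("/api/v1/some/stuff"))
print(dispatch("/api/v2/some/stuff"))
```

Programs written against /api/v1/ keep getting the old shape even after v2 is deployed.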
The idea of connectedness is that resources link to other resources.
Following links traverses the API uncovering more information and structure.
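The traversal can be sketched as a breadth-first walk over links. The API dictionary below is a made-up stand-in for real HTTP responses, and the "links" field name is an assumption; real APIs name their link fields in different ways.

```python
from collections import deque

# Hypothetical API: each URI maps to a representation that links onward.
API = {
    "/blogposts": {"links": ["/blogposts/2014"]},
    "/blogposts/2014": {"links": ["/blogposts/rant-about-rest"]},
    "/blogposts/rant-about-rest": {"links": ["/blogposts"]},
}


def discover(entry_point, fetch=API.get):
    """Breadth-first walk over links, returning URIs in discovery order."""
    seen = [entry_point]
    queue = deque([entry_point])
    while queue:
        resource = fetch(queue.popleft()) or {}
        for href in resource.get("links", []):
            if href not in seen:  # avoid looping on cyclic links
                seen.append(href)
                queue.append(href)
    return seen


print(discover("/blogposts"))
```

Starting from a single entry point, the client learns about the other resources without having their URIs hard-coded.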
All the resources share a common uniform interface. Instead of doing something (URLs as "methods"), set something (URLs as "variables").
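Four main constraints of the uniform interface (as named in Fielding's dissertation):
Identification of resources
Manipulation of resources through representations
Self-descriptive messages
Hypermedia as the engine of application state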
Take advantage of HTTP "verbs":
GET
fetch resource
PUT
create/replace resource
DELETE
remove resource
POST
add/modify resource
POST is not idempotent, so it can be used for partial updates.
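The difference between the verbs is easiest to see against a toy store. The sketch below is not a web server; it only mimics the semantics on a dictionary. The function name, status-code choices, and URI scheme for new members are assumptions, and POST is shown only in its "add" role.

```python
import itertools

store = {}                   # URI -> representation
_ids = itertools.count(1)    # server-chosen identifiers for POSTed members


def apply_verb(method, uri, body=None):
    """Apply one HTTP-style verb to the in-memory store; return (status, body)."""
    if method == "GET":
        return (200, store[uri]) if uri in store else (404, None)
    if method == "PUT":
        # Create or replace at a client-chosen URI; repeating it changes nothing.
        status = 200 if uri in store else 201
        store[uri] = body
        return status, body
    if method == "DELETE":
        if uri in store:
            del store[uri]
            return 204, None
        return 404, None
    if method == "POST":
        # Add a new member under a collection URI; repeating it adds another.
        new_uri = f"{uri}/{next(_ids)}"
        store[new_uri] = body
        return 201, {"href": new_uri}
    return 405, None


print(apply_verb("PUT", "/books/a", {"title": "A"}))
print(apply_verb("PUT", "/books/a", {"title": "A"}))
print(apply_verb("POST", "/books", {"title": "B"}))
print(apply_verb("POST", "/books", {"title": "B"}))
```

Sending the same PUT twice leaves one resource; sending the same POST twice leaves two, which is what "not idempotent" means in practice.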
PUT and DELETE requests can be initiated via JavaScript (AJAX), but not via HTML form submission.
Wikipedia has a nice article about REST.
Hypermedia as the Engine of Application State (HATEOAS).
"Clients make state transitions only through actions that are dynamically identified within hypermedia by the server (e.g., by hyperlinks within hypertext). Except for simple fixed entry points to the application, a client does not assume that any particular action is available for any particular resources beyond those described in representations previously received from the server." Wikipedia - REST (emphasis mine)
Still not widely used
There is no standard way for describing the relationships of resources in RESTful interfaces. Couple of proposals:
API versioning. You might need to change it in the future. If you version, you will not break existing API consumers.
some-domain.com/api/v1/resources/item/775
Standard time formats. Don't invent your own format for expressing times; ISO 8601 is a better option.
"time": "D2014/07/08T11:23pm" // Do NOT do this
"time": "2014-11-13T17:45:11.647Z" //Do this
Pagination. Use offset and limit, and use them consistently.
/api/v1/books/2014/?offset=50&limit=25
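A minimal sketch of offset/limit pagination follows. The response field names ("items", "prev", "next") are assumptions rather than any standard; the point is that the same two parameters drive both the slice and the links to neighbouring pages.

```python
def paginate(items, base_url, offset=0, limit=25):
    """Return one page of items plus links to the neighbouring pages."""

    def link(new_offset):
        return f"{base_url}?offset={new_offset}&limit={limit}"

    return {
        "offset": offset,
        "limit": limit,
        "items": items[offset:offset + limit],
        # No "prev" on the first page, no "next" on the last one.
        "prev": link(max(offset - limit, 0)) if offset > 0 else None,
        "next": link(offset + limit) if offset + limit < len(items) else None,
    }


# Hypothetical collection of 100 books.
books = [f"book-{n}" for n in range(100)]
page = paginate(books, "/api/v1/books/2014/", offset=50, limit=25)
print(page["items"][0], page["next"])
```

Including ready-made prev/next links means clients do not have to compute offsets themselves.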
Reference expansion (a.k.a. link expansion or entity expansion) is also a good idea: some/resource?expand=attribute+other_attribute.
// from: http://some-library-service/keyword/javascript?expand=title
{
  "keywords": ["javascript"],
  "books": [
    {
      "href": "http://some-library-service/isbn/978-0596517748",
      "isbn": "978-0596517748",
      "title": "JavaScript: The Good Parts"
    }, {
      "href": "http://some-library-service/isbn/978-1933988696",
      "isbn": "978-1933988696",
      "title": "Secrets of the JavaScript Ninja"
    },
    ...
  ]
}
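On the server side, expansion could be handled roughly as follows. This is a sketch under assumptions: the BOOKS dictionary and the keyword_resource function are hypothetical, and unknown attribute names are silently ignored, which is one possible policy among several.

```python
from urllib.parse import parse_qs, urlsplit

BASE = "http://some-library-service"  # host name from the examples above

# Hypothetical stand-in for the service's database.
BOOKS = {
    "978-0596517748": {"title": "JavaScript: The Good Parts"},
    "978-1933988696": {"title": "Secrets of the JavaScript Ninja"},
}


def keyword_resource(url):
    """Build the collection, inlining any attributes named in ?expand=."""
    parts = urlsplit(url)
    keyword = parts.path.rsplit("/", 1)[-1]
    # parse_qs decodes "+" as a space, so "a+b" arrives as "a b".
    expand = parse_qs(parts.query).get("expand", [""])[0].split()
    books = []
    for isbn, details in BOOKS.items():
        entry = {"href": f"{BASE}/isbn/{isbn}", "isbn": isbn}
        for name in expand:
            if name in details:
                entry[name] = details[name]
        books.append(entry)
    return {"keywords": [keyword], "books": books}


expanded = keyword_resource(f"{BASE}/keyword/javascript?expand=title")
print(expanded["books"][0])
```

Without the expand parameter the client would need one extra request per book to learn the titles.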
REST services can also invoke other (REST) services.
The stateless constraint somewhat limits service composition (creating task services).
Many-to-many mappings should be resources as well.
For example, consider an API describing groups and projects that can have many-to-many relations.
/* A relationship between a group and a project */
{
  "group": {
    "id": 7,
    "href": "http://..."
  },
  "project": {
    "id": 3,
    "href": "http://..."
  },
  "createdBy": "someUserId",
  "created": "2014-10-28"
}
Deleting this resource will only remove the link; groups and projects will stay intact.
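The same idea in a few lines of code, as a sketch: the relationship lives in its own table, so removing it touches neither endpoint. All names here (groups, projects, memberships, link, unlink) and the sample records are hypothetical.

```python
groups = {7: {"name": "group seven"}}
projects = {3: {"name": "project three"}}
memberships = {}  # membership id -> link resource


def link(membership_id, group_id, project_id, created_by, created):
    """Create a link resource that references a group and a project."""
    memberships[membership_id] = {
        "group": {"id": group_id},
        "project": {"id": project_id},
        "createdBy": created_by,
        "created": created,
    }


def unlink(membership_id):
    """Delete only the link resource; both endpoints stay untouched."""
    del memberships[membership_id]


def projects_of(group_id):
    """List the project ids currently linked to a group."""
    return [m["project"]["id"] for m in memberships.values()
            if m["group"]["id"] == group_id]


link(1, 7, 3, "someUserId", "2014-10-28")
print(projects_of(7))
unlink(1)
print(projects_of(7), 7 in groups, 3 in projects)
```

Because the link is a resource of its own, it can also carry attributes such as createdBy that belong to neither the group nor the project.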
Avoid session-based authentication; in keeping with the principle of statelessness, prefer to authenticate every request. Also prefer API keys (like in GitLab) over username/password.
Authorize based on the content of the resource, not URLs which might change.
OAuth is a good option.
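Per-request authentication with an API key could look roughly like this. The header name X-API-Key, the key store, and the function names are assumptions made for the sketch; real services pick their own header names and store keys more carefully.

```python
import hmac

# Hypothetical key store; a real service would not keep keys in source code.
API_KEYS = {"k-123": "alice"}


def authenticate(headers):
    """Return the user for the API key in this request, or None."""
    presented = headers.get("X-API-Key", "").encode()
    for key, user in API_KEYS.items():
        # compare_digest avoids leaking key contents through timing differences.
        if hmac.compare_digest(presented, key.encode()):
            return user
    return None


def handle_request(headers):
    """Every request is checked on its own; no session is consulted."""
    user = authenticate(headers)
    if user is None:
        return 401, "missing or invalid API key"
    return 200, f"hello {user}"


print(handle_request({"X-API-Key": "k-123"}))
print(handle_request({}))
```

Since nothing is remembered between calls, any server instance can answer any request.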
HTTP has built-in support for caching. HTTP headers contain metadata about fetched resources and can be used to make conditional requests.
Conditional GET requests return the full resource only when a certain condition is fulfilled. For example, a request carrying If-Modified-Since gets the resource back only if it has been modified since the given date; otherwise the server answers 304 Not Modified.
Images, JS libraries, CSS files, static pages, etc. should stay in cache as long as possible. Conversely, on rapidly changing pages, caching should be disabled or set to expire quickly.
Caching is also really beneficial for REST APIs.
ETag is calculated from the resource contents. If it has not changed, the resource hasn't changed either.
Expires informs when the resource expires and should be fetched again.
Last-Modified informs when the resource was last modified on the server.
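An ETag-based conditional GET can be sketched as below. The client echoes a previously received ETag in the If-None-Match request header, and the server skips the body when the tag still matches. Hashing the body with SHA-256 is just one possible way to derive a tag, and the function names are hypothetical.

```python
import hashlib


def etag_for(body: bytes) -> str:
    """Derive an ETag from the resource contents (quoted, as HTTP expects)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(body: bytes, request_headers: dict):
    """Return (status, headers, body), honouring If-None-Match."""
    etag = etag_for(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


resource = b'{"title": "JavaScript: The Good Parts"}'
status, headers, payload = conditional_get(resource, {})
print(status, len(payload))

# The client repeats the request, echoing the ETag it received.
status_again, _, payload_again = conditional_get(
    resource, {"If-None-Match": headers["ETag"]}
)
print(status_again, len(payload_again))
```

The second exchange transfers only headers, which is where the bandwidth saving comes from.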
Almost all APIs have some sort of usage policy. Typically the number of requests you make in a given time frame is limited, e.g. maximum of 10 requests in a minute.
Before you start utilizing some API in your web app, make sure you adhere to the usage policies.
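On the client side, one simple way to stay within such a limit is to track recent call times and refuse (or delay) calls once the window is full. The class below is a sketch of that idea, with the limit of 10 requests per minute taken from the example above; the class and method names are made up, and the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls calls within any window of `period` seconds."""

    def __init__(self, max_calls=10, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls = deque()  # timestamps of calls still inside the window

    def try_acquire(self):
        """Record a call and return True if allowed, else return False."""
        now = self.clock()
        # Forget calls that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False


limiter = RateLimiter(max_calls=10, period=60.0)
allowed = [limiter.try_acquire() for _ in range(12)]
print(allowed.count(True), allowed.count(False))
```

A caller would check try_acquire() before each API request and wait or queue the request when it returns False.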