notes-computer-ABusyProgrammersGuideToWebStandards

Note: a more up-to-date, community-edited version can be found at http://www.emacswiki.org/cgi-bin/community/ABusyDevelopersGuideToWebStandards

Unlike other "Busy Developer's Guide"s I've seen, this one will be really concise, for people who are really busy. Being busy, I haven't really kept up with this stuff myself, and the following are sort of guesses. Lemme know if they're wrong.

Some concepts

"Syntax" is like grammar. It concerns only whether a document follows the grammatical rules or not, e.g. "capitalize the first letter of sentences", etc.

"Semantics" is the meaning which you attach to the things in the document.

For an example of the difference between them: the sentence "Colourless green ideas sleep furiously" is fine syntactically, but needs some help in the semantics department.

A "Protocol" specifies the form of data which travels "on the wire", that is, the specific form of the data being sent over the internet.

An "API" specifies the interface that an application uses to communicate with some service, usually a set of callable functions with parameters and definitions of what they return.

APIs and protocols are similar; to a developer, they both define a set of names that you use when interacting with some other program that does things for your program. The difference is whether or not the specification is telling you the form of data going over the internet. Also, usually APIs are language-specific whereas protocols are language-independent.

Basic stuff

A URI is a fancy term for a URL, although there are those who will disagree. It has the philosophical implication that the URI is "just a name", and that there might not actually be any document at the given location. For example, I might say, "Hi, my name is Bayle Shanks, and my URI is www.fake.name/Shanks/Bayle/TheOneWhoLivesInSanDiego". If you type that into your web browser you probably won't find a web page there. But that URI can still be used as a less-ambiguous identifier for me (compared to "Bayle", although in my case that's not too ambiguous as it is).
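To make the "just a name" idea concrete, here is a minimal sketch that takes the made-up URI apart and compares it as an identifier, without ever fetching anything. The "http://" scheme is an assumption added here so the string parses as a full URI; the original example leaves it off.

```python
from urllib.parse import urlsplit

# The made-up URI from the text; the "http://" prefix is an assumption.
uri = "http://www.fake.name/Shanks/Bayle/TheOneWhoLivesInSanDiego"

# Splitting a URI is pure string work: no network access happens here.
parts = urlsplit(uri)
print(parts.scheme)
print(parts.netloc)
print(parts.path)

# Used as a name, a URI is just a key: two references to the same string
# identify the same thing, whether or not a page exists at that location.
people = {uri: "Bayle Shanks"}
who = people["http://www.fake.name/Shanks/Bayle/TheOneWhoLivesInSanDiego"]
```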

XML is a format for formats, i.e. a format for the syntax of a generalized family of HTML-like markup languages.

DTD ("document type definition") is a format for actually defining the syntax of a customized HTML-like markup language.

For example, you might use an XML DTD to define another language which permits documents with stuff like <transaction name="bought socks"><amount>$10</amount></transaction>. The DTD would essentially say things sorta like, "This language has an element called "transaction", which has an attribute called "name". Within a "transaction" element, you can have an "amount" element". You might call your new made-up language "StupidML". Documents in your new made-up language would be StupidML, but they would also be XML, since StupidML is a subset of XML.
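Since every StupidML document is also XML, any generic XML parser can read it. The sketch below uses Python's standard library parser on the transaction example. Note that this parser only checks that the document is well-formed XML; it does not validate against a DTD, so the StupidML-specific rules would go unchecked here.

```python
import xml.etree.ElementTree as ET

# The StupidML document from the text.
doc = '<transaction name="bought socks"><amount>$10</amount></transaction>'

root = ET.fromstring(doc)
tag = root.tag                    # element name
name = root.get("name")           # attribute value
amount = root.findtext("amount")  # text of the nested element

# A document that breaks XML's own grammar (mismatched tags) is rejected
# before any StupidML-specific rule could even be considered.
try:
    ET.fromstring("<transaction><amount>$10</transaction>")
    well_formed = True
except ET.ParseError:
    well_formed = False
```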

More advanced stuff

RSS is a language for syndicating headlines, that is, for providing lists similar to RecentChanges. There are actually several incompatible RSS standards with different version numbers.
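As a rough illustration, here is what pulling the headlines out of a feed might look like. The feed is a hypothetical RSS 2.0 document written for this example (the example.org URLs and page names are made up); other RSS versions lay things out differently.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 feed for a wiki's recent changes.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Wiki: Recent Changes</title>
    <link>http://example.org/wiki</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>FrontPage edited</title>
      <link>http://example.org/wiki/FrontPage</link>
    </item>
    <item>
      <title>SandBox edited</title>
      <link>http://example.org/wiki/SandBox</link>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(FEED).find("channel")

# Each <item> is one headline: a title plus a link to the full story.
headlines = [
    (item.findtext("title"), item.findtext("link"))
    for item in channel.findall("item")
]
```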

WebDAV is a protocol that extends HTTP to deal with remote editing of web resources. Standard HTTP had "PUT" and "POST" to let you write to web resources, but WebDAV adds more stuff to help with this sort of thing (e.g. locking, properties, and collections). If you're writing to web resources, you probably want to use WebDAV, not just PUT. See RFC 2518 and the IETF working group.
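To give a feel for how WebDAV "extends HTTP", the sketch below builds (but does not send) a PROPFIND request, one of the extra methods WebDAV defines for listing the properties of resources. The server URL is a made-up placeholder, and a real client would also need authentication and response parsing.

```python
import urllib.request

# Request body asking for all properties; "DAV:" is the WebDAV XML namespace.
propfind_body = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
)

# Hypothetical server URL. The request is only constructed, never sent.
req = urllib.request.Request(
    "http://example.org/docs/",
    data=propfind_body,
    method="PROPFIND",  # a WebDAV method, not part of plain HTTP
    headers={"Depth": "1", "Content-Type": "application/xml"},
)

print(req.get_method(), req.full_url)
```

The Depth header of "1" asks for the collection itself plus its immediate children, which is roughly the WebDAV equivalent of listing a directory.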

DeltaV is a protocol that extends HTTP to do versioning for web resources, i.e. "hey, show me what this web page looked like last Tuesday". Note that WebDAV is a prerequisite for DeltaV. See RFC 3253 (with errata applied) and the IETF working group. In my opinion, DeltaV has not caught on and seems to be fading away.

LDAP ("lightweight directory access protocol") is a directory access protocol & query language. LDAP lets you query servers for information, like "give me the email addresses of everyone in the directory who lives in Detroit and whose last name is Smith". It also lets you change the information in the directory. LDAP does authentication & permissions. LDAP is a lighter-weight alternative to DAP, the access protocol of the older X.500 directory standard. LDAP isn't really a web standard (it's not over HTTP), but I included it anyway.
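The Detroit/Smith query above would be expressed as an LDAP search filter. Here is a small sketch that builds one as a string, assuming the directory uses the common attribute names "l" (locality) and "sn" (surname); the helper function names are made up for this example, and actually sending the filter to a server would need a third-party LDAP library.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP search filter."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def and_filter(**conditions: str) -> str:
    """Combine attribute=value tests so that all of them must match."""
    clauses = "".join(
        f"({attr}={escape_filter_value(val)})" for attr, val in conditions.items()
    )
    return f"(&{clauses})"


# "Everyone who lives in Detroit and whose last name is Smith."
# The attribute names are assumptions about the directory's schema.
search_filter = and_filter(l="Detroit", sn="Smith")
print(search_filter)
```

A search would pass this filter along with the list of attributes to return, such as "mail" for the email addresses.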

Atom is a protocol for interaction with weblogs (i.e. for weblog authors to remotely post their stories, as well as other stuff). It's also a format for feeds (like RSS).

DASL is a non-standardized future protocol that extends HTTP to allow submission of queries & searching in a standard way. Note that WebDAV is a prerequisite for DASL.

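PubSubHubbub is a push extension to Atom and RSS, sometimes abbreviated PuSH.

XMPP is a protocol for sending chat (IM) messages and status information.

AMQP is a protocol for passing messages between applications via message queues. Unlike XMPP, it is aimed at application messaging rather than chat.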

Salmon Protocol "As updates and content flow in real time around the Web, conversations around the content are becoming increasingly fragmented into individual silos. Salmon aims to define a standard protocol for comments and annotations to swim upstream to original update sources..."

/.well-known is a reserved URI path on every server. There is a worldwide central registry that tells you what the URIs there mean (RFC 5785). My opinion: too bad it has to be at the server document root. This means that whoever controls the data in the well-known locations has to own their own domain or subdomain. Subdomains are still too hard to deploy, so you don't yet see, e.g., Google Sites giving every user their own subdomain for free, although this is becoming somewhat common (e.g. you can get your own subdomain for your blog at wordpress.com). In any case, wouldn't it have been nicer to have .well-known subdirectories available anywhere along a URI path?
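The "has to be at the document root" point can be seen in a few lines: whatever page you start from, the well-known URI depends only on the scheme and host. This is a sketch with a made-up helper name and a made-up starting URL; "host-meta" is used as an example of a registered well-known name.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_uri(any_uri: str, suffix: str) -> str:
    """Return the /.well-known/ URI on the same server as any_uri."""
    parts = urlsplit(any_uri)
    # Keep only scheme and host; the original path and query are discarded
    # because well-known URIs always hang off the server root.
    return urlunsplit((parts.scheme, parts.netloc, f"/.well-known/{suffix}", "", ""))


# Hypothetical starting point deep inside one user's area of a site.
host_meta = well_known_uri("https://example.com/users/alice/blog?page=2", "host-meta")
print(host_meta)
```

Because the user's own path ("/users/alice/") never appears in the result, only whoever controls the whole host can publish data there.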

People-centric

OStatus "lets people on different social networks follow each other". Related to StatusNet/identi.ca.

WebFinger

ActivityStreams "is an extension to the Atom feed format to express what people are doing around the web"

Portable Contacts "a secure way to access their address books and friends lists without having to take their credentials or scrape their data"

OpenID is an identity/authentication system

FOAF+SSL is an identity/authentication system (see also OpenID-FOAF+SSL bridge)

Web services specific stuff

Web services means that one program on your computer calls a subroutine in another program, and the other program is running on a computer in Los Angeles. Your computer passes arguments to the other computer via HTTP, and then the remote computer runs the subroutine over there, and then it returns the result to your computer via HTTP.

SOAP is a protocol for calling program subroutines remotely. It also specifies how to encode some data types (strings, integers, etc) in XML. For example, I want to call a calculator program in Los Angeles to add 2 + 2. I know the URL and the port and all that already. I tell SOAP, "call subroutine "add" at this URL, and pass it two integer arguments, "2" and "2"". SOAP passes on the information, the remote computer does the computation, and then SOAP tells me, "the result was "4"".
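For a rough idea of what SOAP actually puts on the wire for the "add 2 + 2" call, here is a sketch that builds a SOAP 1.1 envelope by hand. The calculator namespace and the parameter names "a" and "b" are made up for this example; a real service would define them (typically in its WSDL), and real-world code would normally use a SOAP library rather than building the XML directly.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
CALC_NS = "http://example.org/calculator"               # hypothetical service namespace

# Use readable prefixes in the serialized output.
ET.register_namespace("soap", SOAP_ENV)
ET.register_namespace("calc", CALC_NS)

# Envelope > Body > the subroutine call with its two arguments.
envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
call = ET.SubElement(body, f"{{{CALC_NS}}}add")
ET.SubElement(call, "a").text = "2"  # parameter names are assumptions
ET.SubElement(call, "b").text = "2"

soap_request = ET.tostring(envelope, encoding="unicode")
print(soap_request)
```

This XML would then be sent to the service's URL in an HTTP POST, and the answer would come back wrapped in a similar envelope.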

XML-RPC is an earlier protocol to do the same thing as SOAP. I think SOAP is the way to go nowadays.
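Python happens to ship an XML-RPC module, so the same "add 2 + 2" call can be sketched without a server by encoding and decoding the messages locally. The method name "add" is carried over from the calculator example and is not any real service's API.

```python
import xmlrpc.client

# Encode the call add(2, 2) as the XML that would be sent over HTTP.
call_xml = xmlrpc.client.dumps((2, 2), methodname="add")

# What the server side would do first: decode the method name and arguments.
params, method = xmlrpc.client.loads(call_xml)

# Encode the reply a calculator would send back, then decode it as the client would.
reply_xml = xmlrpc.client.dumps((4,), methodresponse=True)
(result,), _ = xmlrpc.client.loads(reply_xml)

print(method, params, result)
```

Against a live service, xmlrpc.client.ServerProxy hides all of this, so the remote subroutine looks like an ordinary method call.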

WSDL is a language for describing the API of a web service. For example, a WSDL document might say, "Program 'Calculator' has four subroutines, 'int add(int, int)', 'int subtract(int, int)', 'int multiply(int, int)' and 'real divide(int, int)'".

More advanced XML-specific stuff

XPath is a standard language for addressing information within XML documents. It allows you to refer to things like "the <title> element of this XML document".
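Here is a short sketch of XPath-style addressing using Python's standard library, which supports only a limited subset of XPath. The document and its element names are made up for this example.

```python
import xml.etree.ElementTree as ET

# A made-up document to address into.
DOC = """<library>
  <book id="b1"><title>First Book</title><year>2001</year></book>
  <book id="b2"><title>Second Book</title><year>2005</year></book>
</library>"""

root = ET.fromstring(DOC)

# "The <title> element of every <book> directly under the root."
titles = [t.text for t in root.findall("./book/title")]

# "The <title> of the <book> whose id attribute is b2."
second_title = root.find("./book[@id='b2']/title").text

print(titles, second_title)
```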

XQuery is a language for writing queries for information within XML documents. It's like SQL for XML.

RDF ("resource description framework") is a language for talking about properties of URIs. E.g. it says things like "the owner of http://microsoft.com is Microsoft Corporation" or "The sex of www.fake.name/Shanks/Bayle/TheOneWhoLivesInSanDiego is "male"".
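One way to picture RDF is as a pile of (subject, property, value) statements. The sketch below stores the two example statements as plain tuples and looks them up; it is not a real RDF library, the property URIs under example.org are invented for illustration, and "http://" has been added to the second subject so that it is a full URI.

```python
# Each statement is a (subject, property, value) triple.
# The property URIs are hypothetical, not from any real vocabulary.
OWNER = "http://example.org/terms/owner"
SEX = "http://example.org/terms/sex"

triples = [
    ("http://microsoft.com", OWNER, "Microsoft Corporation"),
    ("http://www.fake.name/Shanks/Bayle/TheOneWhoLivesInSanDiego", SEX, "male"),
]


def values_of(subject: str, prop: str) -> list:
    """Return every value stated for the given subject and property."""
    return [v for s, p, v in triples if s == subject and p == prop]


owner = values_of("http://microsoft.com", OWNER)
print(owner)
```

Note that the properties are themselves named by URIs, so two people can only mean the same thing by "owner" if they use the same property URI.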

XSLT is a language for defining a transformation to be applied to XML documents. For example, an XSLT transform might be used to transform XML documents into HTML documents.

Some wiki-specific things

ModWiki is a module for RSS to define a sublanguage for talking about properties of Wiki Changes, such as "importance" (minor edit/major edit).

WikiXmlRpc is a (non-standardized) API for automated, remote interaction with a wiki (you send the wiki commands like getPage, putPage, etc).

Some things I don't know yet

I dunno exactly what JXTA is. Something about a specification for infrastructure for P2P stuff.

Contributors: Bayle Shanks

Collection of References

Wiki developers: Oddmuse:References is what AlexSchroeder uses for his wiki development stuff.