
Design good APIs

As a starting point you may read this excellent summary
of the most important design issues:
http://www.clifford.at/papers/2008/apidesign/

The following text is largely based on that paper, but
I added some ideas on various points. I will go into more
detail for libraries, not because they are more important,
but because their constraints are much stricter and a very
good internal API would look like an interface for a library.

A lot of this information resides in DESIGN of the Elektra
sources; its content was even extended with the following
information.

Idea: Every program module extends the programming language.
You can see this paradigm especially in Forth, PostScript, Squeak
or Smalltalk, but it is also valid for virtually any programming
language.

There is a tendency that students today learn more APIs than
data structures, maybe to be able to handle today's real-world
problems.

Making an API hard to use wrongly tends to be more important than
making it easy to use correctly. Searching for a stupid bug costs
*much* more time than falling into some standard traps which are
explained in the documentation. A very good example:
http://www.boost.org/doc/libs/1_36_0/libs/serialization/doc/index.html
See: Compile time trap when saving a non-const value

strncpy vs. strncat show how to do it very wrong: the size argument
means something different in each, so they are extremely easy to use
right, but just as easy to use wrong.
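
To make the trap concrete, here is a minimal sketch of two wrappers
(copy_bounded and append_bounded are hypothetical names, not part of
any library). For strncpy the size limits the bytes written to the
destination and the result may be left unterminated; for strncat the
size limits the characters taken from the source and one more byte is
written for the terminator.

```c
#include <string.h>

/* Copy src into dst (capacity cap > 0), always terminating the result.
 * strncpy alone leaves dst unterminated when strlen(src) >= cap. */
static void copy_bounded(char *dst, size_t cap, const char *src)
{
	strncpy(dst, src, cap - 1);
	dst[cap - 1] = '\0';
}

/* Append src to the string already in dst (capacity cap).
 * The third argument of strncat limits the characters taken from src;
 * it is NOT the size of dst, and strncat adds the '\0' on top of it. */
static void append_bounded(char *dst, size_t cap, const char *src)
{
	size_t used = strlen(dst);
	if (used + 1 < cap)
		strncat(dst, src, cap - used - 1);
}
```

Passing sizeof(buffer) to both functions looks symmetric, which is
exactly why the pair is easy to get wrong.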

The so-called "telephone test" helps you find out whether programs
written with your API can be read aloud easily. Simply read the code
to someone else and check whether they understand it correctly.

Fundamental principles are that the API must hide something (of course)
and should not be optimized for speed. (But the code behind it should
be!) It should be clear that nothing should be optimized without
doing a benchmark first.

The Unix philosophy should always be considered: Do only one thing,
but do it in the best way. Write programs so that they work
together well. They should work on text files, because this is a
universal interface.

Good examples of libraries are: stfl, epoll, pcre, curl and libelektra ;)

Always pass user context pointers, never use global variables for anything.
Even function pointers for e.g. malloc can be stored easily in user
handles. But consider not providing such functionality: users who
really need an alternative allocation facility can also modify and recompile
the library (but use e.g. xmalloc, otherwise that would be a pain too).
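
A minimal sketch of such a handle, assuming hypothetical names
(lib_ctx, lib_open, lib_strdup, lib_close) that do not belong to any
real library. All state, including the allocator, lives in the handle
the user passes around, so nothing is global.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle: everything the library needs is stored here. */
typedef struct lib_ctx {
	void *(*alloc)(size_t size); /* allocator chosen by the caller */
	void (*release)(void *ptr);  /* matching deallocator */
	void *user;                  /* opaque user pointer, never interpreted */
	int last_error;              /* per-handle state instead of a global */
} lib_ctx;

/* Create a handle; if either function is NULL, malloc/free are used. */
lib_ctx *lib_open(void *(*alloc)(size_t), void (*release)(void *), void *user)
{
	if (!alloc || !release) {
		alloc = malloc;
		release = free;
	}
	lib_ctx *ctx = alloc(sizeof *ctx);
	if (!ctx)
		return NULL;
	ctx->alloc = alloc;
	ctx->release = release;
	ctx->user = user;
	ctx->last_error = 0;
	return ctx;
}

/* Duplicate a string with the allocator stored in the handle. */
char *lib_strdup(lib_ctx *ctx, const char *s)
{
	size_t size = strlen(s) + 1;
	char *copy = ctx->alloc(size);
	if (!copy) {
		ctx->last_error = 1;
		return NULL;
	}
	memcpy(copy, s, size);
	return copy;
}

/* Destroy the handle with its own deallocator; NULL is accepted. */
void lib_close(lib_ctx *ctx)
{
	if (!ctx)
		return;
	void (*release)(void *) = ctx->release;
	release(ctx);
}
```

Two handles with different allocators can coexist in one process,
which would not be possible with a global allocator hook.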

libxml2 sadly got that completely wrong (TODO: check if this holds for
all interfaces provided by libxml2).

A difficult point for which there is no general solution is that you
need to find a metaphor or paradigm for the function names or the way
to work with the API (e.g. the trash bin on the Mac).

Use a consistent naming scheme, but it is not important which one:
e.g. CamelCase or with_underscores.
Also don't mix singular and plural.

A very important but non-trivial problem is to give the same name to the
same thing, but a different name to different things. It is difficult because
natural languages don't follow these rules either (there are hundreds of
exceptions), and introducing new words not present in the language
is often discouraged. For translations the same word then needs to be
mapped to the same word. Any API not doing "yet another" thing needs
a glossary where the used words are clarified.

Off-by-one confusion is a topic of its own; it is best to stick to the
conventions of the programming language. For functions returning sizes
of objects it must be clear whether a terminating '\0' is included or not.
Once you have decided, you must stay consistent.
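
As an illustration, here is a small sketch of one possible convention
(name_size and name_dup are hypothetical): every size function returns
the number of bytes INCLUDING the terminating '\0', so the result can
be handed to malloc and memcpy without any "+ 1" at the call site.

```c
#include <stdlib.h>
#include <string.h>

/* Convention assumed here: sizes always include the terminating '\0'. */
size_t name_size(const char *name)
{
	return strlen(name) + 1;
}

/* Return a heap copy of name; the caller frees it with free(). */
char *name_dup(const char *name)
{
	size_t size = name_size(name); /* no "+ 1" needed anywhere else */
	char *copy = malloc(size);
	if (copy)
		memcpy(copy, name, size);
	return copy;
}
```

The opposite convention (sizes without the terminator, like strlen)
works just as well, as long as every function of the API follows it.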

Internal and external APIs must be separated. Internal functions in
libraries should be static so that they are not exported.
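
A short sketch of that separation inside one source file, with
hypothetical names (is_name_char, mylib_name_is_valid). Which
characters count as valid is an assumption made only for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Internal helper: 'static' gives it internal linkage, so it is not
 * visible outside this file and cannot become part of the ABI. */
static int is_name_char(int c)
{
	return isalnum((unsigned char)c) || c == '_' || c == '/';
}

/* External API: the only symbol this file exports.
 * Returns 1 if name is non-empty and uses only allowed characters. */
int mylib_name_is_valid(const char *name)
{
	if (!name || !*name)
		return 0;
	for (const char *p = name; *p; ++p)
		if (!is_name_char(*p))
			return 0;
	return 1;
}
```

Because the helper is invisible to users, it can be renamed or removed
later without breaking anyone.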

The interface should be the minimal one that still tackles all problems
the library is meant to solve.

Never pass out information about data structures. It is impossible to stay
ABI compatible otherwise. Be restrictive in what you return (postconditions)
but as liberal as possible in what you accept (avoid preconditions where
possible, especially history constraints). So it might be a good idea to accept
NULL pointers if you return them. (See the Nickel ini library written
by Charles Lindsay: http://www.chaoslizard.org/devel/bohr/wiki/Docs/Ni)
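
Both points can be sketched with an opaque handle. The names (Item,
item_new, item_value, item_del) are hypothetical; in a real library
only the typedef and the prototypes would go into the public header,
while the struct definition stays in the source file.

```c
#include <stdlib.h>
#include <string.h>

/* Public header part: users only ever see this incomplete type. */
typedef struct Item Item;

/* Private part: the layout can change without breaking the ABI. */
struct Item {
	char *value;
};

/* Returns NULL on allocation failure; a NULL value is treated as "". */
Item *item_new(const char *value)
{
	const char *text = value ? value : "";
	size_t size = strlen(text) + 1;
	Item *it = malloc(sizeof *it);
	if (!it)
		return NULL;
	it->value = malloc(size);
	if (!it->value) {
		free(it);
		return NULL;
	}
	memcpy(it->value, text, size);
	return it;
}

/* Accepts NULL because item_new may return it; never returns NULL. */
const char *item_value(const Item *it)
{
	return it ? it->value : "";
}

/* Accepts NULL, like free() does. */
void item_del(Item *it)
{
	if (!it)
		return;
	free(it->value);
	free(it);
}
```

A caller can chain item_value(item_new("x")) style calls without a
NULL check between every step and still never dereference NULL.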

One way to provide an easier entrance to your library is to offer
an interface for easy use. In Elektra these are the kdbGetKey and
kdbGetString functions (they directly get a key or string from the
database), in curl the curl_easy functions.

Freeing everything you allocate is very difficult in some cases
(e.g. locks in Objective-C), but you can handle it with some techniques
in C or C++ (e.g. allocate in the constructor, free in the destructor) and
with tools like valgrind. If you can't free everything after every call
(because it might be used later), provide a FreeAll or Close function.
In Elektra everything will be freed if you delete all keys and keysets
and close the database on exit.
Qt is a negative example: it always leaves some parts allocated, and e.g.
QColor is not a QObject, which makes it unclear how it gets freed.
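
A minimal sketch of the Close idea, with hypothetical names (Store,
store_open, store_add, store_close): results stay valid after the call
that produced them, and one final call releases everything.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container that owns every string it hands out. */
typedef struct Store {
	char **items;
	size_t count;
} Store;

/* Returns an empty store, or NULL on allocation failure. */
Store *store_open(void)
{
	return calloc(1, sizeof(Store));
}

/* Returns a copy owned by the store; valid until store_close. */
const char *store_add(Store *s, const char *text)
{
	size_t size = strlen(text) + 1;
	char *copy = malloc(size);
	if (!copy)
		return NULL;
	char **grown = realloc(s->items, (s->count + 1) * sizeof *grown);
	if (!grown) {
		free(copy);
		return NULL;
	}
	memcpy(copy, text, size);
	s->items = grown;
	s->items[s->count++] = copy;
	return copy;
}

/* Frees every string handed out by store_add, then the store itself. */
void store_close(Store *s)
{
	if (!s)
		return;
	for (size_t i = 0; i < s->count; ++i)
		free(s->items[i]);
	free(s->items);
	free(s);
}
```

A program that calls store_close before exiting can be run under
valgrind --leak-check=full to check that nothing stays allocated.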

Documentation is another large topic which can't be handled here.
But let me say that data structures are chronically
under-documented (e.g. in gcc it is not the functions that are important
but the tree on which they work) and a big picture is often missing (in
addition to detailed doxygen documentation). (Here Qt is a good example.)
Always provide a starting point in your README.

Defensive programming is not a good idea. Say clearly what you expect
(preconditions) and don't check that again! Code that checks something
is also error-prone and does not lead to better programs. Of course
user input must always be checked!
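
The difference can be sketched with two hypothetical functions: scale
documents its precondition and trusts it, while parse_percent sits at
the boundary to the user and validates everything.

```c
#include <errno.h>
#include <stdlib.h>

/* Precondition: percent is in 0..100. Documented here, not re-checked. */
int scale(int value, int percent)
{
	return value * percent / 100;
}

/* User input is untrusted, so it is validated completely.
 * Returns 0 and stores the number in *out on success, -1 otherwise. */
int parse_percent(const char *text, int *out)
{
	char *end;
	errno = 0;
	long v = strtol(text, &end, 10);
	if (errno != 0 || end == text || *end != '\0' || v < 0 || v > 100)
		return -1;
	*out = (int)v;
	return 0;
}
```

Once parse_percent has succeeded, every later call to scale can rely
on the precondition without checking it a second time.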

Always provide a default branch for the impossible case; it might become
possible some day (e.g. overflow in variables).
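
For example, in a sketch with a hypothetical Mode enum, the default
branch costs one line and turns a future surprise into a defined
result.

```c
/* Hypothetical enum; a later version might add MODE_APPEND. */
typedef enum { MODE_READ, MODE_WRITE } Mode;

/* Returns a printable name for every possible input value. */
const char *mode_name(Mode m)
{
	switch (m) {
	case MODE_READ:
		return "read";
	case MODE_WRITE:
		return "write";
	default:
		/* "Impossible" today, but reachable once the enum grows
		 * or a corrupted value is passed in. */
		return "unknown";
	}
}
```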

Support as many language bindings as possible; use SWIG for that.
Fr Jul 30 13:54:24 CEST 2021