Saturday, December 09, 2006

Thoughts on successful logging

Dealing with logging the last days, again, I thought it would be time to b-log about it too. So let us dive into the questions why programmers log, how programmers log and what the best and worst practices are on how to log.


Why logging?

Nearly every program and nearly every programmer does logging. And because of that, the purpose of logging changes through the lifecycle of a program.

At first, in "programming mode", logging normaly serves as status output and is used to validate the functioning of the newly written code. The simplest example for this would be a log-line in a block of code that is placed there to prove that the block is called anyway.
Introspection is another important thing at this point: If a programmer is not sure about the variables and values of a program at a certain point, he may simply output the values, the "state" of his program, using a log.
Logging also is useful to show off errors that do not directly occur in the programming code, such as a subsystem (like a database connection) went down etc. To know about these things is vital while programming: You never can be sure that your program works fine and dandy if the environment it works in isn't.

But as the program and the programming code evolves and grows more and more to a productive system, the intention of logging changes. The goal is not to monitor the code precisely any more but rather the modules of a program and their proper functionality. The main task of logging changes to: Survey the duration of operations, their failing or their success. This kind of log-mutation often comes smoothly and not abrupt or as a sheer matter of choice because the newly needed informations are usually already needed while programming, too.


How logging is done.

Logging exists as long as programming itself does. It starts by an ordinary output to the console via "println" and does not even end with high-level logging engines that are retrievable via network connections or external logging-client programs.

It always starts the same way. At first, just a couple of informations are merely printed to the console, after that, a little package is written that fulfills this task. Then, with rising complexity, the informations need to be categorized, otherwise crucial facts are lost in too many log-lines. By the way, logging means to trace the activity of a program over time, so the output has to be preserved in files or databases or repositories of any kind. And what about distributed systems on more than one computer, shouldn't they log to one place so that surveillance of only this log-source is possible? What started fairly simple can get pretty complex over time.

To not reinvent the wheel at this stage, there are a lot of logging APIs out there in the world. But they all boil down to some base functionality:
  • The methods to log are - more or less - simple to call
  • Different log-levels are introduced such as "debug", "trace", "info", "warn", "error", "fatal" so that the importance of a log-line can be classified
  • Most loggings have different channels that can be switched on or of for observations
  • Often, the log output can be canalized into a file or similiar data sinks
Anyway, fulfilling these preconditions do not define a good logging mechanism or API.


Best (and worst) practices doing the logdance.

Inserting a line of code that logs something always arises the same question: Is this correct, at this place, this time? Will this spam the log later on or is it accurately set here?

In first sight it is a good thing to log. If you create a system that will work for some time or reaches a certain degree of complexity it always is good to (at least vaguely) know what the system is doing right now. Or which conditions might have led to a failure of the code. Computer programs are systems that react on user or data and interact with their environment on many different variables and states. To write a program without logging is like having a black box you put input in and expect the correct output to be delivered. But then if something doesn't work proper you only have that black box to look at, without any further information why your program did not work.

But, as mentioned before, logging all and everything will not work out, either. Here are some suggestions on logging successful:
  • Try to log a state or error just once. Log (only) at the end of an action or an error-handling block. This will reduce verbosity and makes the log much more readable.
  • Think about to introduce prefixes to the log messages which will give a hint where the log message is placed in the code. Example: You have a module that parses URLs, so every message might start with "ParseURL:...". This way you directly know where the logging message comes from.
  • Do not only think of the place in the program code where you want to log a message, but consider the environment (variables, states, etc.) to be logged, too. This way it is more easy to recollect not only the place but the circumstances from a log message itself.
  • Keep in mind that the intention of logging will change as the program matures (described above). Sometimes spending 10 seconds more for a log message (i.e. to additionally show variables and states in the message) will greatly pay off later on.
So much for the "where to log" and "when to log", let's determine "how to log". As mentioned before, you can use own code or prebuild logging packages. Anyway logging should be made easy for the programmer (otherwise no one will use it) and treat computer ressources with respect (otherwise it will draw the performance of the program down).

Using prebuild packages is not a bad idea. Many functions that will be needed are available and can be used later on, even if at the start you are not aware that they will be useful. But be careful selecting the logger you want to use: It's use will spread through the whole program and changing it afterwards is nearly impossible. In general the development history of logging packages displays the idea what logging really should achive a little bit: From very simple modules to "all-in-one device suitable for every purpose" loggers, they nowadays return to a k.i.s.s-approach. Doing the log dance? Sure! But keep it simple, fast and easy-to-use, otherwise it will be counterproductive and no one will use it.

This article is continued at "How to successful log in programming"

Labels: ,