Art of Clean Code — Logging

Logs that can narrate the story of the system state in simple words

Mohit Gupta
Dev Genius


“In some ways, programming is like painting. You start with a blank canvas and certain basic raw materials. You use a combination of science, art, and craft to determine what to do with them.” — Andrew Hunt

My first lesson on the importance of logging came in 2001–02, from my manager and coach, Puneet Agarwal. We were building a framework of data-bound Swing components that could be used to design any application dynamically in a NetBeans-style IDE, to enable rapid application development.

The design was to extend Swing components, attach metadata for the data sources (and data) in context, pass it across the component hierarchy using component life cycle events such as addNotify/removeNotify, and extend the paint functions to render the components using the data.

This seems like a simple design, but the implementation was complex due to the hierarchical nature of UI elements. Data had to be passed (and cleaned up) from parent to children and vice versa. Any miss in this recursive flow (for example, mishandling data loading in the repeated rendering events triggered by Swing) could cause issues in data loading, rendering, memory management, and user-facing features such as focus handling.

The only savior for debugging such a complex recursive system was the ‘Art of Logging’. This is when my manager taught me the importance of logs, including choosing the right log level with the right amount of information.

I still remember that lesson and follow it:

Logging is an Art

Fast forward to today, and IDEs have many advanced features, such as breakpoints and debug mode. These make debugging much easier, with live runtime data and control over starting, restarting, and stopping execution.

Having said that, when there is an issue in a server or distributed environment, logs are very useful for understanding the issue in one go by looking at the logged data. Even in a local development environment, it is much easier to see the whole system state by reading properly crafted logs.

Hence, effective logging is very important for clean code: code that can be maintained easily over a long time, that can run in production without causing sleepless nights for dev, ops, or users, and that is easy to debug and fix quickly when there is an issue.

Let’s talk about a few important basics of effective logging, and hence of writing clean, effective code.

Have a Common Understanding of ‘Format of Logs’

Yes, it is important. Refer to the previous blog here about clean code, where we discussed having standard formatting for the codebase in a team, as it makes reading code easy. The same principle applies to logs. If there are different kinds of log formats and patterns, it takes time to understand and make sense of them.

We should be able to read logs like a story: a story narrating the state of the system in simple words.

Hence, decide on a format together as a team, and follow it. Sample format:

  • A message telling the stage and state of the system: state1[value1] state2[value2]
  • Error caused by <reason>. Expected behavior was <expectation>. errorMessage[message], plus the exception stack trace if available
  • And so on...
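The sample format above can be sketched with Python's standard logging module. This is only an illustration of one possible team convention: the helper name format_state, the logger name "orders", and the example fields are assumptions made for this sketch, not part of any library.

```python
import logging


def format_state(message: str, **state) -> str:
    """Render 'message key1[value1] key2[value2]' (hypothetical team helper)."""
    fields = " ".join(f"{key}[{value}]" for key, value in state.items())
    return f"{message} {fields}".rstrip()


logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s - %(message)s",
)
logger = logging.getLogger("orders")  # hypothetical component name

# A message telling the stage and state of the system.
logger.info(format_state("Order received", orderId=42, status="NEW"))

# An error: the reason, the expected behavior, the message, and the stack trace.
try:
    int("abc")
except ValueError as exc:
    logger.error(
        format_state(
            "Error caused by non-numeric quantity. "
            "Expected behavior was an integer quantity.",
            errorMessage=str(exc),
        ),
        exc_info=True,  # appends the exception stack trace to the log entry
    )
```

Keeping the formatting in one shared helper means every component produces the same shape of message, whatever shape the team picks.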

The specific format of logs is not that important; having a common understanding of the format is.

Have a Common Understanding of ‘What to Log’

Similar to the format of logs, having a common understanding of ‘what to log’ is equally important.

It helps to instill the habit of logging the right information as part of the team culture, irrespective of component or developer. Ultimately, it leads to consistent logs, with a similar level of information, that anyone can read.

It helps anyone make sense of the system state and errors, and hence anyone can try to fix them. It also helps remove the dependency on one person, or sometimes even one team, at least for understanding the system state or the cause of a failure.

Example:

  • Incoming calls to the system with parameters, and response codes or states.
  • Outgoing calls, parameters, and responses from other services (after taking the security guidelines for logging into consideration, of course).
  • And so on...
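One way to make the examples above a habit is to log calls in a single shared place instead of by hand in every function. The sketch below uses a decorator for that. The names log_call, redact, SENSITIVE_KEYS, and the charge handler are all hypothetical, and the list of sensitive keys is an assumption to be replaced by your own security guidelines.

```python
import functools
import logging

logger = logging.getLogger("payments")  # hypothetical component name

# Assumed list of sensitive parameter names; follow your own security guidelines.
SENSITIVE_KEYS = {"password", "card_number", "token"}


def redact(params: dict) -> dict:
    """Mask sensitive values before they reach the logs."""
    return {k: ("***" if k in SENSITIVE_KEYS else v) for k, v in params.items()}


def log_call(direction: str):
    """Log parameters and response of a keyword-argument call ('Incoming' or 'Outgoing')."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**params):
            logger.info("%s call %s params[%s]", direction, func.__name__, redact(params))
            try:
                response = func(**params)
            except Exception:
                # logger.exception logs at ERROR level and includes the stack trace.
                logger.exception("%s call %s failed params[%s]",
                                 direction, func.__name__, redact(params))
                raise
            logger.info("%s call %s response[%s]", direction, func.__name__, response)
            return response
        return wrapper
    return decorator


@log_call("Incoming")
def charge(user_id: str, amount: float, card_number: str) -> dict:
    # Hypothetical handler standing in for real business logic.
    return {"status": 200, "user_id": user_id}
```

Calling charge(user_id="u1", amount=9.5, card_number="4111") logs the request with the card number masked and then the response. Note that this sketch logs the response as-is, so responses that can carry sensitive data would need the same redaction.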

I have experienced systems where Ops and Dev teams spent hours on calls just to understand the requests, parameters, response codes, and so on.

It is worth spending quality time once to implement effective logs, rather than spending hours later and causing frustration for everyone, including customers.

Have a Common Understanding about ‘Level of Log’

Logging levels are important.

What if all the logs are written at ‘info’? In a live environment, that volume can slow the system down noticeably, because logging involves string operations and I/O, which have a real impact on performance.

And what if most of the logs are at debug, so there is no way to understand the important states in the prod environment without enabling debug mode? Enabling it will slow down the whole system. The same consideration applies to the warning, error, and fatal levels.
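The cost of string operations mentioned above shows up even for log statements that are never written. The sketch below, using Python's standard logging module, counts how often a message argument is turned into a string while debug is disabled. ExpensiveState and the logger name "inventory" are hypothetical names used only for this illustration.

```python
import logging

logger = logging.getLogger("inventory")  # hypothetical component name
logger.setLevel(logging.INFO)            # debug is disabled, as it often is in production


class ExpensiveState:
    """Hypothetical object whose string form is costly to build."""

    def __init__(self):
        self.render_count = 0

    def __str__(self):
        self.render_count += 1
        return "full inventory snapshot"


state = ExpensiveState()

# Eager: the f-string is built before logging decides to drop the message.
logger.debug(f"State changed: {state}")

# Lazy: logging formats the message only if the record is actually emitted.
logger.debug("State changed: %s", state)

# Explicit guard, useful when preparing the arguments is itself expensive.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("State changed: %s", state)

# Only the eager f-string call rendered the state.
print(state.render_count)
```

Passing arguments separately instead of pre-building the string keeps disabled debug statements cheap, which makes it easier to leave them in the code for the day they are needed.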

Define what should go where.

Do we want to log all incoming and outgoing calls in info to understand important system interaction data?

Do we want to log all intermediate important events in info?

Do we want to log all state changes in debug mode, so as to debug any issue by just switching these on?

Do we want to log a DB access failure as an error, or should it be logged as fatal because we don't have a fallback DB to fail over to automatically?

Have a common understanding as a team, and follow it.
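As an illustration, here is one possible set of answers to the questions above, written down as code. This is a sketch of a policy a team might agree on, not a recommendation. The handle_request function, its parameters, and its return values are hypothetical. Python's logging module names the fatal level CRITICAL.

```python
import logging

logger = logging.getLogger("checkout")  # hypothetical component name


def handle_request(order_id: int, db_available: bool, has_fallback_db: bool) -> str:
    # INFO: incoming and outgoing calls, visible in production by default.
    logger.info("Incoming request orderId[%s]", order_id)

    # DEBUG: intermediate state changes, switched on only while investigating.
    logger.debug("State change orderId[%s] state[VALIDATED]", order_id)

    if not db_available:
        if has_fallback_db:
            # ERROR: the operation failed, but the system can recover.
            logger.error("DB access failed, failing over orderId[%s]", order_id)
            return "RETRY_ON_FALLBACK"
        # CRITICAL (fatal): no fallback DB, so the system cannot serve requests.
        logger.critical("DB access failed and no fallback DB exists orderId[%s]", order_id)
        return "UNAVAILABLE"

    logger.info("Outgoing response orderId[%s] status[OK]", order_id)
    return "OK"
```

With the logger configured at INFO, production shows the calls and the failures. Switching it to DEBUG adds the state changes without any code change.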

Define Infra, Tools to Read the Logs

Defining the tools and infra required to ingest, digest, and present log information is very important. The system should make it easy to retrieve logs in an easy-to-read format.

If set up correctly, it saves a lot of effort (especially in distributed environments) in finding log files on different machines, collating data, and then making sense of multiple sources.
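Ingestion is usually easier when every log entry is machine-readable. The sketch below emits one JSON object per line using only the Python standard library. It assumes that the chosen log tool can parse JSON lines, which should be checked against that tool's documentation. JsonFormatter, the field names, and the logger name "billing" are choices made for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON line (field names are assumptions)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


handler = logging.StreamHandler()  # writes to stderr; a shipper could read a file instead
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("billing")  # hypothetical component name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Invoice created invoiceId[%s]", 1001)
```

The human-readable message stays the same as before; only the envelope around it changes, so the format agreed by the team still applies.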

There are many great log analysis tools available nowadays, both open source and commercial. Pick one that makes good sense for your product, team, and organization's ecosystem, and use it. It saves a lot of effort, and saves the team from frustration when production issues occur.

Another benefit: if all the previous points are implemented well, anyone can pull the logs and try to make sense of the system status. Anyone can try to fix the issues, or at least contribute good information.

A few examples of tools and frameworks: ELK, Graylog, Splunk, Sumo Logic, Logmatic. Here is one of the good blogs giving a brief overview of these tools.

Use all of the Above for Monitoring, Alerts, and Stats

Once logs are in place, the format and level of information are defined, and the tools to digest and present logs are configured, we are all set to use this information for a higher purpose: making life even easier and products richer.

Set up monitoring on the logged data to raise alerts for notable events, bad or good. Support and dev teams can be informed at the first sign of an issue in the system. This gives them a chance to act proactively to manage the issue, the users' expectations, or both.

An effective log monitoring system can be a lifesaver, proactively monitoring the vital health signs of the system.
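In practice, alerting is usually configured in the log platform itself. To show the idea in a few lines, the sketch below does it in-process with a custom logging handler that raises an alert when too many errors arrive within a time window. ErrorRateAlertHandler, the threshold values, and the alerts list standing in for a real notification channel are all assumptions made for this example.

```python
import logging
from collections import deque


class ErrorRateAlertHandler(logging.Handler):
    """Call `notify` when `threshold` ERROR+ records arrive within `window_seconds`."""

    def __init__(self, threshold: int, window_seconds: float, notify):
        super().__init__(level=logging.ERROR)  # ignore anything below ERROR
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.notify = notify
        self._timestamps = deque()

    def emit(self, record: logging.LogRecord) -> None:
        now = record.created
        self._timestamps.append(now)
        # Drop errors that have fallen out of the time window.
        while now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.threshold:
            self.notify(
                f"{len(self._timestamps)} errors in {self.window_seconds}s, "
                f"latest: {record.getMessage()}"
            )
            self._timestamps.clear()  # start counting again after alerting


alerts = []  # stand-in for a pager, chat, or email integration
logger = logging.getLogger("gateway")  # hypothetical component name
logger.addHandler(ErrorRateAlertHandler(threshold=3, window_seconds=60, notify=alerts.append))

for attempt in range(3):
    logger.error("Upstream timeout attempt[%s]", attempt)
```

After the third error, the alerts list holds one message. Alerting on a rate rather than on every single error keeps the team informed early without flooding them.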

Data generated by logs can be analyzed further to understand usage patterns. This data can then be used to improve system resilience or product features. For example:

  • Analysis of traffic volume for all APIs can help to scale the system and services proactively.
  • Analysis of usage patterns for features can help to define the strategy for future enhancements, by understanding user behavior proactively.
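A consistent log format makes this kind of analysis almost trivial. The sketch below counts calls per API from log lines written in a "key[value]" style. The sample lines, field names, and helper functions are hypothetical and exist only to illustrate the idea. A log analysis tool would normally run an equivalent query for you.

```python
import re
from collections import Counter

# Hypothetical log lines in a "message key[value]" format.
LOG_LINES = [
    "2024-01-01 10:00:01 INFO Incoming call api[/orders] status[200]",
    "2024-01-01 10:00:02 INFO Incoming call api[/orders] status[500]",
    "2024-01-01 10:00:03 INFO Incoming call api[/users] status[200]",
    "2024-01-01 10:00:04 DEBUG State change orderId[42] state[PAID]",
]

# Matches every key[value] pair in a line.
FIELD = re.compile(r"(\w+)\[([^\]]*)\]")


def parse_fields(line: str) -> dict:
    """Extract all key[value] pairs from one log line."""
    return dict(FIELD.findall(line))


def calls_per_api(lines) -> Counter:
    """Count log lines per API, skipping lines that carry no 'api' field."""
    counts = Counter()
    for line in lines:
        fields = parse_fields(line)
        if "api" in fields:
            counts[fields["api"]] += 1
    return counts


traffic = calls_per_api(LOG_LINES)
print(traffic.most_common())
```

The same parsed fields could be grouped by status to track error rates per API, which is only possible because every component logs those fields the same way.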

Summary

There are many good blogs and resources available online on logging best practices and tools, which can help in researching the best strategy. However, the most important aspect is to have a common understanding of log expectations, patterns, and analysis needs.

Once these basics are defined, agreed upon, and followed at the team level, the rest of the ecosystem can be built easily around them.

The right level of logging is important for system maintenance. It can save many hours, days, and nights of effort for fellow team members and for customers too. Hence, it reflects how professional we are about our responsibilities and how empathetic we are toward our fellow team members and customers.

More

Learn more aspects of clean coding in the following articles:

Art of Clean Code. Refer here.

Art of Clean Code — Error Handling. Refer here.

Art of Clean Code — Documentation. Refer here.

Happy Learning, Happy Coding!

If you enjoyed reading this, please share, give a clap, and follow for similar stories!

For any suggestions, feel free to reach me on LinkedIn: Mohit Gupta


Enjoy building great teams and products. Sharing my experience in software development, personal development, and leadership