When we began testing our latest project, Limpid Markets – an online precious metals trading application, we knew that we needed to carry out thorough searches for bugs as there was a lot at stake.
Deciding what to use
The first thing we did was to remove all old
console.log statements. These were good for debugging during development, but became messy and difficult to follow and produced lots of noise in the logs.
After researching Node logging packages. We decided on Winston as it allowed us to have multiple transports and we were able to configure different levels of logging for our development and production environments. It also formats the log events nicely so they’re easy to search, but we still wanted a service to help us search the logs. We decided to try Loggly as we wanted something that was more efficient than grep.
Example grep command:
cat /var/log/main-api.sys.log | grep ‘2014-02-18.*info:’ -A 10 -B 10
Example of the Loggly interface (from their site):
We wanted to try Loggly as it provides a centralised place to store logs. This meant that we wouldn’t have to log in to the various servers to check our logs. This is especially useful to us as our application is distributed over multiple servers. Loggly also allows users to set up alerts, so we would know straight away if one of our sites were to go down.
How we set up logging
We started by putting meaningful logs into the app using the different levels provided by Winston. We started adding the
"info" level log statements in, to ensure that we can track requests through the application. To do this we set a variable on the request, and logged it out to the
"trans" property in the logs.
Once we were confident we could follow the request through the various stages, we started adding
"debug" log statements. As we were trying to fish out bugs, we used a strategy of logging out everything that passed and what it returned into a function. This allowed us to pin point where the request went wrong.
This initially didn’t work as we hoped. In each iteration we’d track the bugs, fix a few and then get lost with others. We added more logging (both debug and info) and iterated again. This process worked well as it was so fast, allowing us to deploy fixes multiple times in one day.
After the application went live, we quickly realised we need a way to alert us to any critical errors which were being thrown in our application. Sticking with Winston, I found Winston SNS. Similar to the Winston Loggly and Winston Console libraries we were already using, Winston SNS ties in to Amazon’s Simple Notification Service (SNS) and will publish a notification to a SNS topic. As all of our server stacks have an SNS topic in case the server encounters a problem, we were easily able to subscribe our application errors to the same topic. If Winston encounters an
"error" level log statement, a message is sent over the SNS Topic, and we are notified via email. This allows us to quickly react in the event something goes wrong.
Problems we encountered:
Deciding what to log was difficult. We chose the strategy of using the
“info” log level at just enough to allow us to work out where the transaction failed, and then to
“debug” log all variables that were passed to and returned from every function call. This high level of “debug” logging allowed us to clearly determine where the request went wrong.
During our testing period, Loggly were having trouble processing events. Having to wait 10 – 15 minutes after a testing session to be able to begin searching the logs was not acceptable for us. This meant we had to give up on Loggly and return to good ol’ fashioned searching of log files using grep.
Next time we will stick with a similar strategy, without Loggly, as it works well. It would be great to know what other people are using though.