Imagine this scenario—your Python application suddenly crashes in production, causing significant downtime, and you're scrambling to understand what went wrong. You have logs, but they're unstructured and difficult to interpret, making the troubleshooting process akin to finding a needle in a haystack. If only there were a more organized way to view and understand these logs…
Well, there is! With the right log structuring strategy, you can diagnose errors and anomalies more easily and quickly, making your Python application logs more valuable to your team and business.
A core component of observability and debugging, logs are often overlooked or underutilized due to a lack of proper structure and formatting. But when structured correctly, logs can provide invaluable insights about application behavior and usage patterns, making it easier to identify and fix issues, understand user behavior, and even inform product decisions. This blog post will guide you through the best practices of log structuring in your Python applications, helping you transform your logs from a cryptic data dump to a rich, searchable, and actionable source of insight, thereby maximizing your ROI.
Benefits of structured logging
Structured logging is the process of producing, transmitting, and storing log messages in a format that's easily machine-readable, such as JSON. The main advantage here is that by ensuring logs are structured consistently, you’ll get faster and more accurate automated processing and analysis.
In contrast to unstructured logs, which are just strings of text without a defined format, structured logs are designed with machine readability in mind. Each piece of data in the log message is stored in a defined field, allowing software to extract specific pieces of information without needing to parse arbitrary text strings.
Take a look at these logs:
Unstructured logs
INFO - User John Doe with ID 123 made a purchase of $200 on a Visa card with last four digits of 4312 on 2023-05-25.
Equivalent structured logs
{
  "severity": "INFO",
  "timestamp": "2023-05-25T12:34:56Z",
  "userId": 123,
  "userName": "John Doe",
  "action": "purchase",
  "card_type": "Visa",
  "last_four_digits": 4312,
  "amount": 200
}
With unstructured logging, troubleshooting an issue related to a specific transaction can become quite a daunting task. It might require writing complex search queries or even manually scanning through thousands of lines of logs. Structured logging, on the other hand, makes logs machine-readable, enabling easier and more efficient analysis and querying.
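To make the contrast concrete, here's a minimal sketch (standard library only, with hypothetical log lines) of how a structured log stream can be queried without any text parsing:

```python
import json

# Hypothetical log lines, already structured as JSON strings
log_lines = [
    '{"severity": "INFO", "userId": 123, "action": "purchase", "amount": 200}',
    '{"severity": "INFO", "userId": 456, "action": "login"}',
]

# Finding every purchase is a simple field lookup,
# not a fragile regex over free-form text.
purchases = [
    entry for entry in map(json.loads, log_lines)
    if entry.get("action") == "purchase"
]

print(purchases)
```

With unstructured text, the same task would mean pattern-matching against whatever phrasing each log message happens to use.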
Getting started with structured logging in Python using structlog
While Python’s built-in logging module is robust and flexible, using a specialized library like structlog can make structured logging more intuitive and easier to manage. The structlog library wraps the built-in logging module to transform your logs into machine-readable key-value pairs.
To get started:
Install and import structlog
- Using pip, install structlog:
  pip install structlog
- Within your Python application, import the library:
  import structlog
Configure log processors
The structlog library contains processor pipelines, which transform and route your log entries. They can add information to your logs (like timestamps), format the final log output (like converting it to JSON), filter logs, or even redirect logs to different targets.
structlog comes with multiple included processors, including:
- TimeStamper: Adds timestamps to your logs.
- JSONRenderer: Converts your log entries into JSON format.
- format_exc_info: Extracts and formats exception information from the exc_info field, if present.
- UnicodeEncoder: Encodes Unicode strings in the event dictionary to UTF-8.
Have a look at this code sample, which implements some of these processors.
import logging
import structlog

# Configure the standard library root logger so INFO messages
# pass filter_by_level and print the rendered JSON as-is
logging.basicConfig(format="%(message)s", level=logging.INFO)

# Configure structlog to output structured logs in JSON format
structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
)
# Get a logger
logger = structlog.get_logger()
# Now we can log structured messages!
logger.info("User logged in", user_id="1234", ip="192.0.2.0")
First, the example configures structlog to use several processors that control how messages are logged.
- The TimeStamper processor adds a timestamp to each log message.
- The JSONRenderer processor converts the log message into a JSON string.
Next, the example uses a structlog logger to record an information-level event. Notice how the log message includes key-value pairs: in this case, user_id="1234" and ip="192.0.2.0". These are structured data that provide context for the log message.
When you run this code, it will output the following:
{"event": "User logged in", "user_id": "1234", "ip": "192.0.2.0", "timestamp": "2023-05-25T14:22:01Z"}
Remember, the order of processors matters. Each processor receives the output of the previous one, so a misordered pipeline can fail in subtle ways. For example, JSONRenderer should always be last in your processor list, as it turns the event dictionary into a string, making it unsuitable for further processing.
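The pipeline mechanics are easy to see with a hand-rolled sketch (plain Python with hypothetical processor functions, not structlog's actual internals): each processor takes the event dictionary and returns it transformed, and the renderer must come last because it returns a string.

```python
import json
from datetime import datetime, timezone

# Hypothetical processor: accepts the event dict, returns it enriched
def add_timestamp(event_dict):
    event_dict["timestamp"] = datetime.now(timezone.utc).isoformat()
    return event_dict

# The renderer returns a string instead of a dict,
# so no further processor could run after it.
def render_json(event_dict):
    return json.dumps(event_dict)

def run_pipeline(event, processors):
    for processor in processors:
        event = processor(event)
    return event

line = run_pipeline({"event": "User logged in", "user_id": "1234"},
                    [add_timestamp, render_json])
print(line)  # a JSON string, ready to be written out
```

Swapping the two processors in this sketch would hand add_timestamp a string and crash, which is exactly why JSONRenderer belongs at the end of a real structlog pipeline.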
Best practices for configuring structured logs
Proper implementation of structured logging can greatly benefit your team and organization. Here are some best practices:
- Use a machine-readable format such as JSON for all log output.
- Keep field names consistent across services, so queries against fields like userId work reliably.
- Include a timestamp and severity level in every log entry.
- Attach contextual data as key-value pairs (user IDs, actions, amounts) rather than embedding it in the message text.
- Avoid logging sensitive data, such as full card numbers or personal details.
Querying structured logs in New Relic
Structured logging, when used in combination with NRQL, can open up a plethora of opportunities for deep and insightful analysis of your application’s logs. Here are a few examples of the power that structured logging brings to NRQL queries:
Aggregate user actions
With structured logging, you can easily group and count the actions taken by users. For instance, you could track how many times users perform a particular action, using this NRQL code:
SELECT count(*) FROM Log WHERE action = 'purchase' FACET userId
This query counts the number of 'purchase' actions, broken down by individual user IDs, which is only possible if the action and userId fields are reliably structured in your logs.
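To see what the query is doing, here's the same aggregation sketched in plain Python over a handful of hypothetical parsed log entries:

```python
from collections import Counter

# Hypothetical parsed log entries, as a query engine would see them
logs = [
    {"action": "purchase", "userId": 123},
    {"action": "purchase", "userId": 123},
    {"action": "login", "userId": 123},
    {"action": "purchase", "userId": 456},
]

# count(*) WHERE action = 'purchase' FACET userId, in Python terms
purchases_by_user = Counter(
    entry["userId"] for entry in logs if entry["action"] == "purchase"
)
print(purchases_by_user)  # Counter({123: 2, 456: 1})
```

NRQL runs this kind of aggregation at query time across all of your ingested logs, with no code to write.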
Identifying errors
Structured logs make it easy to track error occurrences. You could use a query like:
SELECT count(*) FROM Log WHERE level = 'ERROR' FACET exceptionType SINCE 1 week ago
This will count all 'ERROR'-level logs over the past week, broken down by the type of exception that occurred. This is only possible if the level and exceptionType fields are consistently structured in your logs.
Analyzing performance metrics
If your logs include structured performance data, such as response times or system load, you can use NRQL to analyze these metrics:
SELECT average(responseTime), max(responseTime), min(responseTime) FROM Log WHERE action = 'request' SINCE 1 day ago
This query calculates the average, maximum, and minimum response time for all 'request' actions over the past day. This sort of analysis is straightforward with structured logs but would be challenging or impossible with unstructured logs.
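In Python terms, the query above boils down to this kind of aggregation, shown here with the statistics module over hypothetical response times:

```python
import statistics

# Hypothetical response times (ms) pulled from structured 'request' logs
response_times = [120, 85, 230, 95, 150]

print(statistics.mean(response_times))  # average(responseTime) -> 136
print(max(response_times))              # max(responseTime) -> 230
print(min(response_times))              # min(responseTime) -> 85
```

With unstructured logs, the response time would first have to be scraped out of free-form text before any arithmetic could happen.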
Providing insights into user behavior
Structured logging can provide deeper insights into user behavior. For instance, you can find out which actions your most active users perform most often:
SELECT top(action, 5) FROM Log WHERE userId IN (SELECT userId FROM Log SINCE 1 week ago LIMIT 10)
This NRQL query selects the top five actions across a set of up to 10 users seen in the last week of logs.
Conclusion
These examples underline the enhanced analytical power that structured logging brings when used with a powerful query language like NRQL. By adding structure to your logs, you'll be better equipped to understand and interpret your data, leading to more informed decision-making and more efficient problem-solving.
Next steps
Learn more about log management with New Relic and log management best practices.
Got lots of logs? Check out our tutorial on how to optimize and manage them.
To try it for yourself, if you’re not already using New Relic, sign up to get free, forever access.
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve or endorse the information, views or products available on such sites.