Backend Modules

You shouldn’t need to interact with these modules directly; working with the main Lumberjack object should be sufficient.

Actions

Provide the ActionQueue class.

class lumberjack.actions.ActionQueue(elasticsearch, config)

Hold a queue of actions and a thread to bulk-perform them.

This is instantiated automatically by the lumberjack.Lumberjack object. It keeps a queue of indexing actions to be performed in Elasticsearch and performs them in bulk (a ‘flush’) when one of three things happens:

  1. It has waited interval seconds without flushing, or
  2. The length of the queue has exceeded max_queue_length, or
  3. A flush is triggered manually.
Note:

You should not need to instantiate, or even interact with, this yourself; it is intended to be wrapped by lumberjack.Lumberjack. If you do use it directly for some reason, note that it is a subclass of threading.Thread, so you should call its start() method after initialisation. (A sketch of direct use follows the parameter list below.)

Parameters:
  • elasticsearch – The elasticsearch.Elasticsearch object on which to perform the bulk indexing.
  • config – The Lumberjack config. See the Configuration section in the docs for details.
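
If you do drive this class directly, a minimal sketch of setting it up might look as follows. The empty config dict is a placeholder (the real keys are described in the Configuration section), and the Elasticsearch connection details will depend on your cluster.

    from elasticsearch import Elasticsearch
    from lumberjack.actions import ActionQueue

    es = Elasticsearch()   # adjust hosts/options for your cluster
    config = {}            # placeholder; see the Configuration section for real keys

    queue = ActionQueue(es, config)
    queue.start()          # required: ActionQueue is a threading.Thread subclass
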
queue_index(suffix, doc_type, body)

Queue a new document to be added to Elasticsearch.

If the queue becomes longer than self.max_queue_length then a flush is automatically triggered.

Parameters:
  • suffix – The suffix of the index into which we should index the document.
  • doc_type – The Elasticsearch type of the document to be indexed. Usually this should correspond to a registered schema in Lumberjack.
  • body – The actual document contents, as a dict.
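
For illustration, queuing a document on the queue from the sketch above might look like this; the suffix, type and body values are arbitrary examples.

    # Queue a document for the index whose name ends in '2015.01',
    # under the Elasticsearch type 'mylogs'.
    queue.queue_index(suffix='2015.01', doc_type='mylogs',
                      body={'message': 'disk nearly full', 'level': 30})
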
run()

The main method for the ActionQueue thread.

Called by the start() method. Not to be called directly.

trigger_flush()

Manually trigger a flush of the queue.

This is intended to be called from the main thread; it interrupts the timeout in the main loop. As such it is not guaranteed to trigger a flush immediately, only to skip the countdown to the next one, so the flush will happen the next time the Python interpreter switches to this thread.
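
As a sketch, continuing from the examples above, a batch-loading script might queue several documents and then request a flush rather than wait out the interval. Here events_to_index is an assumed iterable of document dicts.

    for event in events_to_index:   # assumed: an iterable of document dicts
        queue.queue_index(suffix='2015.01', doc_type='mylogs', body=event)

    # Ask the worker thread to flush as soon as it next runs; this call
    # returns immediately and does not wait for the flush to complete.
    queue.trigger_flush()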

Schemas

Provide the SchemaManager class.

class lumberjack.schemas.SchemaManager(elasticsearch, config)

Manage the ‘schemas’ for different types of log data.

A detailed description of schemas is given in the documentation for lumberjack.Lumberjack.register_schema.

This class manages the list of registered schemas and ensures that they are processed and passed to Elasticsearch as appropriate.

Parameters:
  • elasticsearch – The elasticsearch.Elasticsearch object to register mappings and templates with.
  • config – The Lumberjack config. See the Configuration section in the docs for details.
register_schema(logger, schema)

Take a new schema and add it to the roster.

This also automatically parses the schema into a mapping and adds it into the appropriate index template in Elasticsearch.

Parameters:
  • logger – The name of the logger which the log data will be emitted on.
  • schema – The schema data to be processed into a mapping.
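
A hedged sketch of registering a schema directly is shown below; the es and config objects are as in the ActionQueue example above, and the logger name and schema contents are invented placeholders (the actual schema format is documented under lumberjack.Lumberjack.register_schema).

    from lumberjack.schemas import SchemaManager

    schemas = SchemaManager(es, config)

    # Register a schema for log data emitted on the 'myapp.audit' logger.
    # The dict below is illustrative only; see lumberjack.Lumberjack.register_schema
    # for the real schema format.
    schemas.register_schema('myapp.audit',
                            {'properties': {'message': {'type': 'string'}}})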

Handler

Provide classes to fit into the Python logging framework.

class lumberjack.handler.ElasticsearchFormatter(fmt=None, datefmt=None)

Formatter which prepares logs for insertion into Elasticsearch.

format(record)

Add some metadata and deal with string logs.

It adds a @timestamp field and a level field. level contains the log level as an integer.

Log data should be a dict, but for compatibility with general Python logging usage the formatter can also handle log data received as a string. In that case a dict is returned containing a single message field, whose value is the string message.

Parameters:
  • record – The logging.LogRecord object to be formatted.
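
As a sketch of the two cases, assuming the behaviour described above (the logger name and messages are arbitrary):

    import logging
    from lumberjack.handler import ElasticsearchFormatter

    formatter = ElasticsearchFormatter()

    # Dict log data: the returned document gains '@timestamp' and 'level' fields.
    record = logging.LogRecord('myapp', logging.INFO, __file__, 0,
                               {'event': 'login', 'user': 'alice'}, None, None)
    doc = formatter.format(record)

    # String log data: wrapped as {'message': ...} plus the same metadata.
    record = logging.LogRecord('myapp', logging.WARNING, __file__, 0,
                               'disk nearly full', None, None)
    doc = formatter.format(record)
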
class lumberjack.handler.ElasticsearchHandler(action_queue, suffix_format='%Y.%m')

Elasticsearch-specific subclass of logging.Handler.

Parameters:
  • action_queue – A lumberjack.ActionQueue object to which the formatted log entries are passed.
  • suffix_format – The strftime()-style format string from which to generate the time-based index suffixes for Elasticsearch.
emit(record)

Format the log record and pass it to the ActionQueue.

Generates the appropriate index time-suffix based on self.suffix_format.

Parameters:
  • record – The logging.LogRecord object to format and index.
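
Putting the pieces together, a sketch of wiring the handler into the standard logging framework might look like this. The queue object is the ActionQueue from the earlier sketch; setting the formatter explicitly is shown defensively, since the handler's default formatter is not documented in this section.

    import logging
    from lumberjack.handler import ElasticsearchHandler, ElasticsearchFormatter

    handler = ElasticsearchHandler(action_queue=queue, suffix_format='%Y.%m.%d')
    handler.setFormatter(ElasticsearchFormatter())   # may already be the default

    logger = logging.getLogger('myapp.audit')
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)

    # Each log call is formatted and queued for bulk indexing into an index
    # whose name ends with the date-based suffix, e.g. '2015.01.31'.
    logger.info({'event': 'login', 'user': 'alice'})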