Creating specio plugins

Specio is plugin-based. Every supported format is provided by a plugin. You can write your own plugins to make specio support additional formats, and we would be interested in adding such code to the specio codebase!

What is a plugin

In specio, a plugin provides one or more Format objects, each with a corresponding Reader class. Each Format object represents an implementation for reading a particular file format; its Reader class does the actual reading.

The reader object has a request attribute that can be used to obtain information about the read Request, such as user-provided keyword arguments, as well as to get access to the raw data.

Registering

Strictly speaking, a format can be used standalone. However, to allow specio to automatically select it for a specific file, the format must be registered using specio.formats.add_format().

Note that a plugin is not required to be part of the specio package; as long as a format is registered, specio can use it. This makes specio very easy to extend.

What methods to implement

Specio is designed such that plugins only need to implement a few private methods. The public API is implemented by the base classes. As a result, the public methods can be given a decent docstring once, which does not have to be repeated in the plugins.
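The division of labour described above can be sketched with a toy base class. This is not specio's actual code: BaseReader, ListReader, and the public method names get_length, get_data, and close are hypothetical stand-ins chosen for illustration; only the underscore-prefixed hook names come from this page.

```python
class BaseReader(object):
    """Hypothetical stand-in for a reader base class (not specio's code)."""

    def __init__(self, **kwargs):
        # The base class forwards user keyword arguments to the hook.
        self._open(**kwargs)

    # -- public API, written and documented once in the base class

    def get_length(self):
        """Return the number of items this reader can provide."""
        return self._get_length()

    def get_data(self, index):
        """Return the item at ``index``; raise IndexError if out of range."""
        if not 0 <= index < self._get_length():
            raise IndexError('index %d out of range' % index)
        return self._get_data(index)

    def close(self):
        """Release any resources held by the reader."""
        self._close()

    # -- private hooks, to be implemented by each plugin

    def _open(self, **kwargs):
        raise NotImplementedError()

    def _close(self):
        raise NotImplementedError()

    def _get_length(self):
        raise NotImplementedError()

    def _get_data(self, index):
        raise NotImplementedError()


class ListReader(BaseReader):
    """Toy plugin reader: only the private hooks are implemented."""

    def _open(self, items=()):
        self._items = list(items)

    def _close(self):
        self._items = []

    def _get_length(self):
        return len(self._items)

    def _get_data(self, index):
        return self._items[index]


reader = ListReader(items=[10, 20, 30])
print(reader.get_length())   # 3
print(reader.get_data(1))    # 20
reader.close()
```

In this sketch the index check and the docstrings live in one place, so a plugin author only writes the four short hook methods.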

For the Format class, the following needs to be implemented/specified:

  • The format needs a short name, a description, and a list of file extensions that are common for the file format in question. These are set when instantiating the Format object.
  • Use a docstring to provide more detailed information about the format/plugin, such as parameters for reading and saving that the user can supply via keyword arguments.
  • Implement _can_read(request), which returns a bool. See also the Request class.
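As an illustration of the _can_read contract, the check below inspects the first bytes of a file rather than trusting the extension alone. It is a sketch under stated assumptions: FakeRequest is a hypothetical stand-in that exposes only the filename and firstbytes attributes, and the b'DUMMY' signature is invented for the example. A real plugin would put the same logic in its _can_read method and receive a real Request object.

```python
class FakeRequest(object):
    """Hypothetical stand-in exposing only filename and firstbytes."""

    def __init__(self, filename, firstbytes):
        self.filename = filename
        self.firstbytes = firstbytes


# Invented signature for illustration; a real format defines its own header.
MAGIC = b'DUMMY'


def can_read(request):
    """Return True only if the file starts with the expected signature."""
    return request.firstbytes.startswith(MAGIC)


good = FakeRequest('scan.foobar', b'DUMMY\x00\x01rest of header')
bad = FakeRequest('scan.foobar', b'plain text, wrong header')
print(can_read(good))  # True
print(can_read(bad))   # False
```

Both files carry the same extension, but only the one with the matching header is claimed, so the format does not promise more than it can deliver.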

For the Format.Reader class:

  • Implement _open(**kwargs) to initialize the reader. Deal with the user-provided keyword arguments here.
  • Implement _close() to clean up.
  • Implement _get_length() to provide a suitable length based on what the user expects. Can be inf for streaming data.
  • Implement _get_data(index) to return the data and a meta-data dict; the template below wraps both in a Spectrum object.
  • Implement _get_meta_data(index) to return a meta-data dict. If index is None, it should return the ‘global’ meta-data.
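The return conventions of these hooks can be sketched with a toy in-memory reader. Everything here is an assumption made for illustration: InMemoryReader does not derive from specio's base class, the records and source keyword arguments are hypothetical, and plain Python lists stand in for arrays.

```python
class InMemoryReader(object):
    """Toy reader over a list of (values, meta) pairs; not specio code."""

    def _open(self, records=(), source='memory'):
        # Deal with user-provided keyword arguments here.
        self._records = list(records)
        self._global_meta = {'source': source}

    def _close(self):
        # Drop references so the data can be garbage collected.
        self._records = []

    def _get_length(self):
        # A streaming source could return float('inf') instead.
        return len(self._records)

    def _get_data(self, index):
        # Return the data together with its meta-data dict.
        values, meta = self._records[index]
        return values, dict(meta)

    def _get_meta_data(self, index):
        # index None means the 'global' meta-data.
        if index is None:
            return dict(self._global_meta)
        return dict(self._records[index][1])


reader = InMemoryReader()
reader._open(records=[([1.0, 2.0], {'unit': 'counts'}),
                      ([3.0, 4.0], {'unit': 'volts'})])
print(reader._get_length())         # 2
print(reader._get_data(1))          # ([3.0, 4.0], {'unit': 'volts'})
print(reader._get_meta_data(None))  # {'source': 'memory'}
reader._close()
```

The hooks are called directly here only because the sketch has no base class; in a real plugin they are invoked through the public API.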

Example / template plugin

"""Example plugin. You can use this as a template for your own plugin."""

# Copyright (c) 2017
# Authors: Guillaume Lemaitre <guillaume.lemaitre@inria.fr>
# License: BSD 3 clause

from __future__ import absolute_import, print_function, division

import numpy as np

from .. import formats
from ..core import Format
from ..core import Spectrum


class DummyFormat(Format):
    """ The dummy format is an example format that does nothing.
    It will never indicate that it can read a file. When
    explicitly asked to read, it will simply read the bytes.

    This documentation is shown when the user does ``help('thisformat')``.

    Parameters
    ----------
    Specify arguments in numpy doc style here.

    Attributes
    ----------
    Specify the specific attributes that can be useful.

    """

    def _can_read(self, request):
        # This method is called when the format manager is searching
        # for a format to read a certain file. Return True if this format
        # can do it.
        #
        # The format manager is aware of the extensions
        # that each format can handle. It will first ask all formats
        # that *seem* to be able to read it whether they can. If none
        # can, it will ask the remaining formats if they can: the
        # extension might be missing, and this allows formats to provide
        # functionality for certain extensions, while giving preference
        # to other plugins.
        #
        # If a format says it can, it should live up to it. The format
        # would ideally check the request.firstbytes and look for a
        # header of some kind.
        #
        # The request object has:
        # request.filename: a representation of the source (only for reporting)
        # request.firstbytes: the first 256 bytes of the file.

        if request.filename.lower().endswith(self.extensions):
            return True
        return False

    # -- reader

    class Reader(Format.Reader):

        def _open(self, some_option=False, length=1):
            # Specify kwargs here. Optionally, the user-specified kwargs
            # can also be accessed via the request.kwargs object.
            #
            # The request object provides two ways to get access to the
            # data. Use just one:
            #  - Use request.get_file() for a file object (preferred)
            #  - Use request.get_local_filename() for a file on the system
            self._fp = self.request.get_file()
            self._length = length  # passed as an arg in this case for testing
            self._data = None

        def _close(self):
            # Close the reader.
            # Note that the request object will close self._fp
            pass

        def _get_length(self):
            # Return the number of spectra. Can be np.inf
            return self._length

        def _get_data(self, index=None):
            # Return the data and meta data for the given index
            if index is not None and index >= self._length:
                raise IndexError('Index %i out of range (length %i)'
                                 % (index, self._length))
            # Read all bytes
            if self._data is None:
                self._data = self._fp.read()
            # Put in a numpy array
            spec = np.frombuffer(self._data, 'uint8')
            spec = spec[np.newaxis, :]
            # Return array and dummy meta data
            return Spectrum(spec, np.squeeze(spec), {})

        def _get_meta_data(self, index):
            # Get the meta data for the given index. If index is None, it
            # should return the global meta data.
            return {}  # This format does not support meta data


# Register. You register an *instance* of a Format class. Here we specify:
format = DummyFormat('dummy',  # short name
                     'An example format that does nothing.',  # one line descr.
                     '.foobar .nonexistentext',  # list of extensions
                     )
formats.add_format(format)