airfs.io

Cloud storage abstract IO classes

These abstract classes serve as bases for implementing storage-specific I/O classes.

class airfs.io.ObjectRawIOBase(name, mode='r', storage_parameters=None, **kwargs)

Base class for binary cloud storage object I/O.

In write mode, this class needs enough memory to hold the entire object being written. In append mode, the cloud object is read and stored in memory on instantiation. For big objects, use ObjectBufferedIOBase, which can perform operations with less memory.

In read mode, this class provides random access to the cloud object and requires only as much memory as the data accessed.

Parameters:
  • name (path-like object) – URL or path to the file which will be opened.
  • mode (str) – The mode can be ‘r’ (default), ‘w’, ‘a’ or ‘x’, for reading, writing, appending or exclusive creation.
  • storage_parameters (dict) – Storage configuration parameters. Generally, client configuration and credentials.
  • unsecure (bool) – If True, disables TLS/SSL to improve transfer performance, but makes the connection insecure.
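
The memory behaviour of write and append modes can be pictured with a minimal sketch built on the standard io.RawIOBase. This is not the airfs implementation: InMemoryRawIO and FAKE_STORAGE are hypothetical names, and a dict stands in for the cloud storage. The whole object is kept in memory and only saved on flush, and append mode first loads the existing content.

```python
import io

# Hypothetical stand-in for a cloud bucket: object name -> bytes.
FAKE_STORAGE = {}


class InMemoryRawIO(io.RawIOBase):
    """Toy raw writer that buffers the whole object and saves it on flush."""

    def __init__(self, name, mode="w"):
        super().__init__()
        self._name = name
        # Append mode starts from the existing object content.
        initial = FAKE_STORAGE.get(name, b"") if "a" in mode else b""
        self._write_buffer = bytearray(initial)

    def writable(self):
        return True

    def write(self, b):
        self._write_buffer += bytes(b)
        return len(b)

    def flush(self):
        # close() calls flush() before marking the stream closed.
        if not self.closed:
            FAKE_STORAGE[self._name] = bytes(self._write_buffer)


with InMemoryRawIO("bucket/key", "w") as stream:
    stream.write(b"hello ")
    stream.write(b"world")

with InMemoryRawIO("bucket/key", "a") as stream:
    stream.write(b"!")

print(FAKE_STORAGE["bucket/key"])
```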
close()

Flush the write buffers of the stream if applicable and close the object.

fileno()

Returns underlying file descriptor if one exists.

OSError is raised if the IO object does not use a file descriptor.

flush()

Flush the write buffers of the stream if applicable and save the object on the cloud.

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

mode

The mode.

Returns:Mode.
Return type:str
name

The file name.

Returns:Name.
Return type:str
readable()

Return True if the stream can be read from. If False, read() will raise OSError.

Returns:Supports reading.
Return type:bool
readall()

Read and return all the bytes from the stream until EOF.

Returns:Object content
Return type:bytes
readinto(b)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read.

Parameters:b (bytes-like object) – buffer.
Returns:number of bytes read
Return type:int
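
As an illustration of the readinto contract, the following sketch uses io.BytesIO as a stand-in for a cloud object stream (an assumption; airfs classes are not used here). The buffer is filled in place, and the return value tells how many bytes are valid.

```python
import io

# io.BytesIO stands in for a cloud object stream here.
stream = io.BytesIO(b"cloud object content")

buffer = bytearray(5)            # pre-allocated, writable buffer
count = stream.readinto(buffer)  # fills the buffer in place
first = bytes(buffer)

# Near the end of the stream, fewer bytes than the buffer size remain.
stream.seek(-3, io.SEEK_END)
tail_count = stream.readinto(buffer)
tail = bytes(buffer[:tail_count])
```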
readline()

Read and return a line from the stream.

If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines()

Return a list of lines from the stream.

hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
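
A short sketch of the hint behaviour, using io.BytesIO as a stand-in stream (an assumption, since the same semantics come from the standard io module): reading stops once the accumulated size of the lines reaches past hint, so complete lines are always returned.

```python
import io

stream = io.BytesIO(b"a\nbb\nccc\n")

# The first line is 2 bytes and the second brings the total to 5,
# which exceeds the hint of 3, so reading stops after two lines.
first_lines = stream.readlines(3)

# Without a hint, the remaining lines are returned.
remaining = stream.readlines()
```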

seek(offset, whence=0)

Change the stream position to the given byte offset.

Parameters:
  • offset (int) – Offset is interpreted relative to the position indicated by whence.
  • whence (int) – The default value for whence is SEEK_SET. Values for whence are: SEEK_SET or 0 – start of the stream (the default), offset should be zero or positive; SEEK_CUR or 1 – current stream position, offset may be negative; SEEK_END or 2 – end of the stream, offset is usually negative.
Returns:The new absolute position.
Return type:int
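
The three whence values can be compared side by side in a small sketch. io.BytesIO is used as a stand-in for a seekable cloud object stream, which is an assumption made only to keep the example runnable.

```python
import io

stream = io.BytesIO(b"0123456789")

from_start = stream.seek(4)                 # SEEK_SET: absolute offset
from_current = stream.seek(2, io.SEEK_CUR)  # relative to current position
from_end = stream.seek(-3, io.SEEK_END)     # relative to end of stream
tail = stream.read()                        # reads from the new position
```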

seekable()

Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.

Returns:Supports random access.
Return type:bool
tell()

Return the current stream position.

Returns:Stream position.
Return type:int
truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
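
The following sketch shows the standard io semantics of truncate on an in-memory stream. io.BytesIO is an assumption used as a stand-in; whether a given storage supports truncation depends on the concrete class.

```python
import io

stream = io.BytesIO(b"0123456789")
stream.seek(4)

new_size = stream.truncate()   # size defaults to the current position
content = stream.getvalue()

smaller = stream.truncate(2)   # explicit size, smaller than the position
position = stream.tell()       # the position itself is not moved
```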

writable()

Return True if the stream supports writing. If False, write() and truncate() will raise OSError.

Returns:Supports writing.
Return type:bool
write(b)

Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written.

Parameters:b (bytes-like object) – Bytes to write.
Returns:The number of bytes written.
Return type:int
class airfs.io.ObjectBufferedIOBase(name, mode='r', buffer_size=None, max_buffers=0, max_workers=None, **kwargs)

Base class for buffered binary cloud storage object I/O

Parameters:
  • name (path-like object) – URL or path to the file which will be opened.
  • mode (str) – The mode can be ‘r’ (default) or ‘w’, for reading or writing.
  • buffer_size (int) – The size of buffer.
  • max_buffers (int) – The maximum number of buffers to preload in read mode or awaiting flush in write mode. 0 for no limit.
  • max_workers (int) – The maximum number of threads that can be used to execute the given calls.
  • storage_parameters (dict) – Storage configuration parameters. Generally, client configuration and credentials.
  • unsecure (bool) – If True, disables TLS/SSL to improve transfer performance, but makes the connection insecure.
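
To give an idea of how buffer_size and max_workers interact, here is a minimal sketch that fetches fixed-size ranges concurrently and joins them in order. It is not the airfs implementation: DATA, read_range and buffered_read are hypothetical names, a bytes object stands in for the cloud object, and the max_buffers limit is not modelled.

```python
from concurrent.futures import ThreadPoolExecutor

DATA = bytes(range(256)) * 4  # hypothetical object content, 1024 bytes


def read_range(start, end):
    """Stand-in for a ranged read request on the cloud object."""
    return DATA[start:end]


def buffered_read(size, buffer_size=100, max_workers=4):
    """Fetch fixed-size ranges concurrently and join them in order."""
    ranges = [
        (start, min(start + buffer_size, size))
        for start in range(0, size, buffer_size)
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order of the input ranges.
        parts = pool.map(lambda r: read_range(*r), ranges)
        return b"".join(parts)


content = buffered_read(len(DATA))
```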
close()

Flush the write buffers of the stream if applicable and close the object.

detach()

Disconnect this buffer from its underlying raw stream and return it.

After the raw stream has been detached, the buffer is in an unusable state.

fileno()

Returns underlying file descriptor if one exists.

OSError is raised if the IO object does not use a file descriptor.

flush()

Flush the write buffers of the stream if applicable.

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

mode

The mode.

Returns:Mode.
Return type:str
name

The file name.

Returns:Name.
Return type:str
peek(size=-1)

Return bytes from the stream without advancing the position.

Parameters:size (int) – Number of bytes to read. -1 to read the full stream.
Returns:bytes read
Return type:bytes
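
The effect of peek on the stream position can be sketched with the standard io.BufferedReader, used here as a stand-in (an assumption; the standard class may return more bytes than requested, and the airfs classes may differ in that detail).

```python
import io

buffered = io.BufferedReader(io.BytesIO(b"header:payload"))

peeked = buffered.peek(6)    # look ahead without consuming
position = buffered.tell()   # position is unchanged by peek
data = buffered.read(6)      # the same bytes are still readable
```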
raw

The underlying raw stream

Returns:Raw stream.
Return type:ObjectRawIOBase subclass
read(size=-1)

Read and return up to size bytes, with at most one call to the underlying raw stream’s read method.

Parameters:size (int) – Number of bytes to read. -1 to read the stream until end.
Returns:Object content
Return type:bytes
read1(size=-1)

Read and return up to size bytes, with at most one call to the underlying raw stream’s read method.

Parameters:size (int) – Number of bytes to read. -1 to read the stream until end.
Returns:Object content
Return type:bytes
readable()

Return True if the stream can be read from. If False, read() will raise OSError.

Returns:Supports reading.
Return type:bool
readinto(b)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read.

Parameters:b (bytes-like object) – buffer.
Returns:number of bytes read
Return type:int
readinto1(b)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read.

Use at most one call to the underlying raw stream’s readinto method.

Parameters:b (bytes-like object) – buffer.
Returns:number of bytes read
Return type:int
readline()

Read and return a line from the stream.

If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines()

Return a list of lines from the stream.

hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

seek(offset, whence=0)

Change the stream position to the given byte offset.

Parameters:
  • offset – Offset is interpreted relative to the position indicated by whence.
  • whence – The default value for whence is SEEK_SET. Values for whence are: SEEK_SET or 0 – start of the stream (the default), offset should be zero or positive; SEEK_CUR or 1 – current stream position, offset may be negative; SEEK_END or 2 – end of the stream, offset is usually negative.
Returns:The new absolute position.
Return type:int

seekable()

Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.

Returns:Supports random access.
Return type:bool
tell()

Return the current stream position.

Returns:Stream position.
Return type:int
truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.

writable()

Return True if the stream supports writing. If False, write() and truncate() will raise OSError.

Returns:Supports writing.
Return type:bool
write(b)

Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written.

Parameters:b (bytes-like object) – Bytes to write.
Returns:The number of bytes written.
Return type:int
class airfs.io.SystemBase(storage_parameters=None, unsecure=False, roots=None, **_)

Cloud storage system handler.

Subclasses of this class are not intended to be public and are implementation details.

This base system is for object storage that does not handle files with a true hierarchy as file systems do. Directories are virtual with this kind of storage.

Parameters:
  • storage_parameters (dict) – Storage configuration parameters. Generally, client configuration and credentials.
  • unsecure (bool) – If True, disables TLS/SSL to improve transfer performance, but makes the connection insecure.
  • roots (tuple) – Tuple of roots to force use.
client

Storage client

Returns:client
copy(src, dst, other_system=None)

Copy object of the same storage.

Parameters:
  • src (str) – Path or URL.
  • dst (str) – Path or URL.
  • other_system (airfs._core.io_system.SystemBase subclass) – Other storage system. May be required for some storage.
ensure_dir_path(path, relative=False)

Ensure the path is a dir path.

Should end with ‘/’ except for schemes and locators.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
Returns:dir path
Return type:path
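
The trailing-slash rule can be sketched on relative paths as follows. This is an illustration under stated assumptions, not the airfs code: the helper names are hypothetical, and a locator is assumed to be a single path segment such as a bucket name.

```python
def is_locator_sketch(rel_path):
    """Assumed rule: a locator is a single path segment (e.g. a bucket)."""
    return bool(rel_path) and "/" not in rel_path.rstrip("/")


def ensure_dir_path_sketch(rel_path):
    """Return the relative path in directory form."""
    if is_locator_sketch(rel_path):
        return rel_path.rstrip("/")        # locators carry no trailing slash
    if rel_path:
        return rel_path.rstrip("/") + "/"  # directories end with one slash
    return rel_path                        # empty path: storage root


results = [
    ensure_dir_path_sketch(p)
    for p in ("bucket", "bucket/", "bucket/dir", "bucket/dir/", "")
]
```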

exists(path=None, client_kwargs=None, assume_exists=None)

Return True if path refers to an existing path.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • assume_exists (bool or None) – Value to return when permissions are insufficient to determine whether the file exists. If None (default), the permission exception is re-raised. If True or False, that value is returned.
Returns:True if exists.
Return type:bool
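
The assume_exists fallback amounts to a small piece of exception handling, sketched below. The function exists_with_fallback and the check callable are hypothetical, and the built-in PermissionError is assumed to represent the storage permission error.

```python
def exists_with_fallback(check, assume_exists=None):
    """Run check(); on PermissionError, re-raise or return assume_exists."""
    try:
        return check()
    except PermissionError:
        if assume_exists is None:
            raise  # default behaviour: propagate the permission error
        return assume_exists


def denied():
    """Stand-in for a storage request rejected for lack of permission."""
    raise PermissionError("403 Forbidden")


allowed_result = exists_with_fallback(lambda: True)
assumed_false = exists_with_fallback(denied, assume_exists=False)
assumed_true = exists_with_fallback(denied, assume_exists=True)
```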

get_client_kwargs(path)

Get base keyword arguments for client for a specific path.

Parameters:path (str) – Absolute path or URL.
Returns:client args
Return type:dict
getctime(path=None, client_kwargs=None, header=None)

Return the creation time of path.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:The number of seconds since the epoch (see the time module).
Return type:float

getmtime(path=None, client_kwargs=None, header=None)

Return the time of last modification of path.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:The number of seconds since the epoch (see the time module).
Return type:float

getsize(path=None, client_kwargs=None, header=None)

Return the size, in bytes, of path.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:Size in bytes.
Return type:int
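
The header argument accepted by getctime, getmtime and getsize lets these values be derived from an already fetched object header. A rough sketch, assuming the header uses the standard HTTP Content-Length and Last-Modified fields (actual key names depend on the storage), with hypothetical helper names:

```python
from email.utils import parsedate_to_datetime

# Hypothetical object header, using standard HTTP header fields.
HEADER = {
    "Content-Length": "2048",
    "Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT",
}


def size_from_header(header):
    """Size in bytes, parsed from the Content-Length field."""
    return int(header["Content-Length"])


def mtime_from_header(header):
    """Seconds since the epoch, parsed from the Last-Modified field."""
    return parsedate_to_datetime(header["Last-Modified"]).timestamp()


size = size_from_header(HEADER)
mtime = mtime_from_header(HEADER)
```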

head(path=None, client_kwargs=None, header=None)

Returns object HTTP header.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:HTTP header.
Return type:dict

is_locator(path, relative=False)

Returns True if path refers to a locator.

Depending on the storage, a locator may be a bucket or container name, a hostname, …

Parameters:
  • path (str) – path or URL.
  • relative (bool) – Path is relative to current root.
Returns:True if locator.
Return type:bool

isdir(path=None, client_kwargs=None, virtual_dir=True, assume_exists=None)

Return True if path is an existing directory.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • virtual_dir (bool) – If True, checks whether the directory exists virtually when the path does not exist as a specific object.
  • assume_exists (bool or None) – Value to return when permissions are insufficient to determine whether the file exists. If None (default), the permission exception is re-raised. If True or False, that value is returned.
Returns:True if directory exists.
Return type:bool
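
The notion of a virtual directory can be illustrated over a flat set of object keys. This is a sketch under assumptions, not the airfs logic: KEYS and isdir_sketch are hypothetical, and a key ending with ‘/’ is assumed to act as an explicit directory marker.

```python
# Hypothetical flat key space of an object storage.
KEYS = {"bucket/data/a.csv", "bucket/data/raw/b.csv", "bucket/logs/"}


def isdir_sketch(path, virtual_dir=True):
    """Return True if path is a directory marker or a virtual directory."""
    dir_path = path.rstrip("/") + "/"
    if dir_path in KEYS:
        return True  # explicit directory marker object
    if virtual_dir:
        # Virtual directory: at least one object lives under the prefix.
        return any(key.startswith(dir_path) for key in KEYS)
    return False


marker = isdir_sketch("bucket/logs")
virtual = isdir_sketch("bucket/data")
strict = isdir_sketch("bucket/data", virtual_dir=False)
missing = isdir_sketch("bucket/missing")
```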

isfile(path=None, client_kwargs=None, assume_exists=None)

Return True if path is an existing regular file.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • assume_exists (bool or None) – Value to return when permissions are insufficient to determine whether the file exists. If None (default), the permission exception is re-raised. If True or False, that value is returned.
Returns:True if file exists.
Return type:bool

islink(path=None, header=None)

Returns True if object is a symbolic link.

Parameters:
  • path (str) – File path or URL.
  • header (dict) – Object header.
Returns:True if object is a symlink.
Return type:bool

list_objects(path='', relative=False, first_level=False, max_request_entries=None)

List objects.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
  • first_level (bool) – If True, returns only first-level objects. Otherwise, returns the full tree.
  • max_request_entries (int) – If specified, maximum entries returned by request.
Returns:object name str, object header dict
Return type:generator of tuple
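
The difference between a first-level listing and a full-tree listing can be sketched over a flat key space. OBJECTS and list_objects_sketch are hypothetical, the headers are made up, and deeper objects are assumed to collapse into one virtual directory entry with an empty header.

```python
# Hypothetical flat key space: object key -> object header.
OBJECTS = {
    "a.txt": {"size": 1},
    "dir/b.txt": {"size": 2},
    "dir/sub/c.txt": {"size": 3},
}


def list_objects_sketch(prefix="", first_level=False):
    """Yield (name, header) tuples, with names relative to prefix."""
    seen_dirs = set()
    for key in sorted(OBJECTS):
        if not key.startswith(prefix):
            continue
        name = key[len(prefix):]
        if first_level and "/" in name:
            # Collapse deeper objects into one virtual directory entry.
            directory = name.split("/", 1)[0] + "/"
            if directory not in seen_dirs:
                seen_dirs.add(directory)
                yield directory, {}
            continue
        yield name, OBJECTS[key]


full_tree = list(list_objects_sketch())
top_level = list(list_objects_sketch(first_level=True))
in_dir = list(list_objects_sketch("dir/", first_level=True))
```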

make_dir(path, relative=False)

Make a directory.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
relpath(path)

Get path relative to storage.

Parameters:path (str) – Absolute path or URL.
Returns:relative path.
Return type:str
remove(path, relative=False)

Remove an object.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
roots

Return URL roots for this storage.

Returns:URL roots
Return type:tuple of str
split_locator(path)

Split the path into a pair (locator, path).

Parameters:path (str) – Absolute path or URL.
Returns:locator, path.
Return type:tuple of str
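
relpath and split_locator can be pictured as two successive steps: strip the storage root, then separate the first path segment. The sketch below is illustrative only; ROOTS and both helper names are hypothetical, and the locator is assumed to be the first segment of the relative path.

```python
ROOTS = ("s3://", "https://storage.example.com/")  # hypothetical URL roots


def relpath_sketch(path):
    """Strip the first matching root; other paths pass through unchanged."""
    for root in ROOTS:
        if path.startswith(root):
            return path[len(root):]
    return path


def split_locator_sketch(path):
    """Split a path into (locator, path inside the locator)."""
    locator, _, tail = relpath_sketch(path).partition("/")
    return locator, tail


relative = relpath_sketch("s3://bucket/dir/key")
pair = split_locator_sketch("s3://bucket/dir/key")
locator_only = split_locator_sketch("bucket")
```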
stat(path=None, client_kwargs=None, header=None)

Get the status of an object.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:Stat result object. Follows the “os.stat_result” specification and may contain storage-dependent extra entries.
Return type:namedtuple

storage

Storage name

Returns:Storage
Return type:str
storage_parameters

Storage parameters

Returns:Storage parameters
Return type:dict
class airfs.io.ObjectRawIORandomWriteBase(name, mode='r', storage_parameters=None, **kwargs)

Base class for binary cloud storage object I/O that supports flushing parts of a file instead of requiring the full file to be flushed at once.
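
The idea of flushing parts rather than the whole file can be sketched with a toy writer. PartWriter is a hypothetical name and a list stands in for the remote storage; this illustrates the general technique only, not how airfs or any specific storage handles parts.

```python
class PartWriter:
    """Toy writer that uploads fixed-size parts as soon as they are full."""

    def __init__(self, part_size):
        self.part_size = part_size
        self.pending = bytearray()  # bytes not yet uploaded
        self.uploaded_parts = []    # stand-in for parts stored remotely

    def write(self, data):
        self.pending += data
        # Upload every complete part and keep only the remainder in memory.
        while len(self.pending) >= self.part_size:
            self.uploaded_parts.append(bytes(self.pending[:self.part_size]))
            del self.pending[:self.part_size]
        return len(data)

    def close(self):
        # The last part may be shorter than part_size.
        if self.pending:
            self.uploaded_parts.append(bytes(self.pending))
            self.pending.clear()


writer = PartWriter(part_size=10)
writer.write(b"x" * 25)
sizes_before_close = [len(part) for part in writer.uploaded_parts]
writer.close()
sizes_after_close = [len(part) for part in writer.uploaded_parts]
```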

close()

Flush the write buffers of the stream if applicable and close the object.

fileno()

Returns underlying file descriptor if one exists.

OSError is raised if the IO object does not use a file descriptor.

flush()

Flush the write buffers of the stream if applicable and save the object on the cloud.

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

mode

The mode.

Returns:Mode.
Return type:str
name

The file name.

Returns:Name.
Return type:str
readable()

Return True if the stream can be read from. If False, read() will raise OSError.

Returns:Supports reading.
Return type:bool
readall()

Read and return all the bytes from the stream until EOF.

Returns:Object content
Return type:bytes
readinto(b)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read.

Parameters:b (bytes-like object) – buffer.
Returns:number of bytes read
Return type:int
readline()

Read and return a line from the stream.

If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines()

Return a list of lines from the stream.

hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

seek(offset, whence=0)

Change the stream position to the given byte offset.

Parameters:
  • offset (int) – Offset is interpreted relative to the position indicated by whence.
  • whence (int) – The default value for whence is SEEK_SET. Values for whence are: SEEK_SET or 0 – start of the stream (the default), offset should be zero or positive; SEEK_CUR or 1 – current stream position, offset may be negative; SEEK_END or 2 – end of the stream, offset is usually negative.
Returns:The new absolute position.
Return type:int

seekable()

Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.

Returns:Supports random access.
Return type:bool
tell()

Return the current stream position.

Returns:Stream position.
Return type:int
truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.

writable()

Return True if the stream supports writing. If False, write() and truncate() will raise OSError.

Returns:Supports writing.
Return type:bool
write(b)

Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written.

Parameters:b (bytes-like object) – Bytes to write.
Returns:The number of bytes written.
Return type:int
class airfs.io.ObjectBufferedIORandomWriteBase(name, mode='r', buffer_size=None, max_buffers=0, max_workers=None, **kwargs)

Buffered base class for binary cloud storage object I/O that supports flushing parts of a file instead of requiring the full file to be flushed at once.

close()

Flush the write buffers of the stream if applicable and close the object.

detach()

Disconnect this buffer from its underlying raw stream and return it.

After the raw stream has been detached, the buffer is in an unusable state.

fileno()

Returns underlying file descriptor if one exists.

OSError is raised if the IO object does not use a file descriptor.

flush()

Flush the write buffers of the stream if applicable.

isatty()

Return whether this is an ‘interactive’ stream.

Return False if it can’t be determined.

mode

The mode.

Returns:Mode.
Return type:str
name

The file name.

Returns:Name.
Return type:str
peek(size=-1)

Return bytes from the stream without advancing the position.

Parameters:size (int) – Number of bytes to read. -1 to read the full stream.
Returns:bytes read
Return type:bytes
raw

The underlying raw stream

Returns:Raw stream.
Return type:ObjectRawIOBase subclass
read(size=-1)

Read and return up to size bytes, with at most one call to the underlying raw stream’s read method.

Parameters:size (int) – Number of bytes to read. -1 to read the stream until end.
Returns:Object content
Return type:bytes
read1(size=-1)

Read and return up to size bytes, with at most one call to the underlying raw stream’s read method.

Parameters:size (int) – Number of bytes to read. -1 to read the stream until end.
Returns:Object content
Return type:bytes
readable()

Return True if the stream can be read from. If False, read() will raise OSError.

Returns:Supports reading.
Return type:bool
readinto(b)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read.

Parameters:b (bytes-like object) – buffer.
Returns:number of bytes read
Return type:int
readinto1(b)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read.

Use at most one call to the underlying raw stream’s readinto method.

Parameters:b (bytes-like object) – buffer.
Returns:number of bytes read
Return type:int
readline()

Read and return a line from the stream.

If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines()

Return a list of lines from the stream.

hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

seek(offset, whence=0)

Change the stream position to the given byte offset.

Parameters:
  • offset – Offset is interpreted relative to the position indicated by whence.
  • whence – The default value for whence is SEEK_SET. Values for whence are: SEEK_SET or 0 – start of the stream (the default), offset should be zero or positive; SEEK_CUR or 1 – current stream position, offset may be negative; SEEK_END or 2 – end of the stream, offset is usually negative.
Returns:The new absolute position.
Return type:int

seekable()

Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.

Returns:Supports random access.
Return type:bool
tell()

Return the current stream position.

Returns:Stream position.
Return type:int
truncate()

Truncate file to size bytes.

File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.

writable()

Return True if the stream supports writing. If False, write() and truncate() will raise OSError.

Returns:Supports writing.
Return type:bool
write(b)

Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written.

Parameters:b (bytes-like object) – Bytes to write.
Returns:The number of bytes written.
Return type:int
class airfs.io.FileSystemBase(storage_parameters=None, unsecure=False, roots=None, **_)

Cloud storage system handler with true file hierarchy.

client

Storage client

Returns:client
copy(src, dst, other_system=None)

Copy object of the same storage.

Parameters:
  • src (str) – Path or URL.
  • dst (str) – Path or URL.
  • other_system (airfs._core.io_system.SystemBase subclass) – Other storage system. May be required for some storage.
ensure_dir_path(path, relative=False)

Ensure the path is a dir path.

Should end with ‘/’ except for schemes and locators.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
Returns:dir path
Return type:path

exists(path=None, client_kwargs=None, assume_exists=None)

Return True if path refers to an existing path.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • assume_exists (bool or None) – Value to return when permissions are insufficient to determine whether the file exists. If None (default), the permission exception is re-raised. If True or False, that value is returned.
Returns:True if exists.
Return type:bool

get_client_kwargs(path)

Get base keyword arguments for client for a specific path.

Parameters:path (str) – Absolute path or URL.
Returns:client args
Return type:dict
getctime(path=None, client_kwargs=None, header=None)

Return the creation time of path.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:The number of seconds since the epoch (see the time module).
Return type:float

getmtime(path=None, client_kwargs=None, header=None)

Return the time of last modification of path.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:The number of seconds since the epoch (see the time module).
Return type:float

getsize(path=None, client_kwargs=None, header=None)

Return the size, in bytes, of path.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:Size in bytes.
Return type:int

head(path=None, client_kwargs=None, header=None)

Returns object HTTP header.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:HTTP header.
Return type:dict

is_locator(path, relative=False)

Returns True if path refers to a locator.

Depending on the storage, a locator may be a bucket or container name, a hostname, …

Parameters:
  • path (str) – path or URL.
  • relative (bool) – Path is relative to current root.
Returns:True if locator.
Return type:bool

isdir(path=None, client_kwargs=None, virtual_dir=True, assume_exists=None)

Return True if path is an existing directory.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • virtual_dir (bool) – If True, checks whether the directory exists virtually when the path does not exist as a specific object.
  • assume_exists (bool or None) – Value to return when permissions are insufficient to determine whether the file exists. If None (default), the permission exception is re-raised. If True or False, that value is returned.
Returns:True if directory exists.
Return type:bool

isfile(path=None, client_kwargs=None, assume_exists=None)

Return True if path is an existing regular file.

Parameters:
  • path (str) – Path or URL.
  • client_kwargs (dict) – Client arguments.
  • assume_exists (bool or None) – Value to return when permissions are insufficient to determine whether the file exists. If None (default), the permission exception is re-raised. If True or False, that value is returned.
Returns:True if file exists.
Return type:bool

islink(path=None, header=None)

Returns True if object is a symbolic link.

Parameters:
  • path (str) – File path or URL.
  • header (dict) – Object header.
Returns:True if object is a symlink.
Return type:bool

list_objects(path='', relative=False, first_level=False, max_request_entries=None)

List objects.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
  • first_level (bool) – If True, returns only first-level objects. Otherwise, returns the full tree.
  • max_request_entries (int) – If specified, maximum entries returned by request.
Returns:object name str, object header dict
Return type:generator of tuple

make_dir(path, relative=False)

Make a directory.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
relpath(path)

Get path relative to storage.

Parameters:path (str) – Absolute path or URL.
Returns:relative path.
Return type:str
remove(path, relative=False)

Remove an object.

Parameters:
  • path (str) – Path or URL.
  • relative (bool) – Path is relative to current root.
roots

Return URL roots for this storage.

Returns:URL roots
Return type:tuple of str
split_locator(path)

Split the path into a pair (locator, path).

Parameters:path (str) – Absolute path or URL.
Returns:locator, path.
Return type:tuple of str
stat(path=None, client_kwargs=None, header=None)

Get the status of an object.

Parameters:
  • path (str) – File path or URL.
  • client_kwargs (dict) – Client arguments.
  • header (dict) – Object header.
Returns:Stat result object. Follows the “os.stat_result” specification and may contain storage-dependent extra entries.
Return type:namedtuple

storage

Storage name

Returns:Storage
Return type:str
storage_parameters

Storage parameters

Returns:Storage parameters
Return type:dict