Pool

class qiime2.core.cache.Pool(cache, name=None, reuse=False)

Pools are folders in the cache that contain symlinks to many different pieces of data. There are two types of pool:

Process Pools: These pools have names of the form <process-id>-<process-create-time>@<user>, based on the process that created them. They exist only for the lifetime of that process and ensure that the data the process is using stays in the cache.

Named Pools: Named pools are keyed just like individual pieces of data. They exist for as long as they have a key, and all of the data they symlink to is retained in the cache for as long as the pool exists.
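The retention rule described above (data stays in the cache for as long as some pool symlinks to it) can be sketched with a toy model built only on the standard library. This is not QIIME 2's implementation: the directory layout and the helper names save_to_pool and garbage_collect are hypothetical and exist only for illustration.

```python
import os
import tempfile
import uuid


def save_to_pool(data_dir, pool_dir, payload):
    """Write payload into data_dir and symlink to it from pool_dir."""
    data_id = str(uuid.uuid4())
    target = os.path.join(data_dir, data_id)
    with open(target, "w") as fh:
        fh.write(payload)
    # The symlink is named after the id of the data it points to.
    os.symlink(target, os.path.join(pool_dir, data_id))
    return data_id


def garbage_collect(data_dir, pools_dir):
    """Delete every data file that no pool links to (toy model)."""
    referenced = set()
    for pool in os.listdir(pools_dir):
        referenced.update(os.listdir(os.path.join(pools_dir, pool)))
    for data_id in os.listdir(data_dir):
        if data_id not in referenced:
            os.remove(os.path.join(data_dir, data_id))


with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "data")
    pools_dir = os.path.join(root, "pools")
    pool_dir = os.path.join(pools_dir, "my-pool")
    os.makedirs(data_dir)
    os.makedirs(pool_dir)

    data_id = save_to_pool(data_dir, pool_dir, "0\n1\n2\n")

    # While the pool links to the data, collection leaves it alone.
    garbage_collect(data_dir, pools_dir)
    kept_while_linked = os.listdir(data_dir) == [data_id]

    # Removing the symlink drops the pool's reference, so the next
    # collection deletes the data.
    os.remove(os.path.join(pool_dir, data_id))
    garbage_collect(data_dir, pools_dir)
    removed_after_unlink = os.listdir(data_dir) == []
```

In this sketch, deleting a symlink never deletes data by itself. Data only disappears when a later collection pass finds it unreferenced, which mirrors the behavior shown in the remove example further down.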

__init__(cache, name=None, reuse=False)

Call with name=None and reuse=True to create a process pool. Call with a name to create a named pool.

Note

In general, you should not invoke this constructor directly and should instead use qiime2.core.cache.Cache.create_pool to create a pool properly on a given cache.

Parameters:
  • cache (Cache) – The cache this pool will be created under.

  • name (str) – The name of the pool we are creating if it is a named pool.

  • reuse (bool) – Whether we will be reusing this pool if it already exists.

Raises:

ValueError – If the pool already exists and reuse is False.
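The interaction between an existing pool and the reuse flag can be illustrated with a small standard-library sketch. It assumes only that a pool corresponds to a folder; the helper open_pool_dir is hypothetical and is not part of QIIME 2.

```python
import os
import tempfile


def open_pool_dir(pools_dir, name, reuse=False):
    """Return the folder for pool `name`, creating it if needed.

    Toy model: raises ValueError if the folder already exists and
    reuse is False.
    """
    path = os.path.join(pools_dir, name)
    if os.path.exists(path):
        if not reuse:
            raise ValueError(f"Pool already exists: {name!r}")
    else:
        os.mkdir(path)
    return path


with tempfile.TemporaryDirectory() as pools_dir:
    first = open_pool_dir(pools_dir, "pool")
    # Asking for the same pool again is fine when reuse=True ...
    again = open_pool_dir(pools_dir, "pool", reuse=True)
    # ... and an error when reuse is left at its default of False.
    try:
        open_pool_dir(pools_dir, "pool")
        raised = False
    except ValueError:
        raised = True
```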

__enter__()

Tells the currently set cache to use this named pool. If no cache is currently set, the cache this named pool belongs to is set as well.

Note

If you have already set a cache then you cannot set a named pool that belongs to a different cache.

Raises:
  • ValueError – If you try to set a pool that is not on the currently set cache.

  • ValueError – If you have already set a pool and try to set another.

Examples

>>> import os
>>> import tempfile
>>> from qiime2.core.cache import Cache, get_cache
>>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-')
>>> cache_path = os.path.join(test_dir.name, 'cache')
>>> cache = Cache(cache_path)
>>> pool = cache.create_pool(key='pool')
>>> # Entering the pool in a with statement sets the cache to the one
>>> # the pool belongs to, and sets the named pool on that cache to the
>>> # pool we entered
>>> with pool:
...     current_cache = get_cache()
...     cache.named_pool == pool
True
>>> current_cache == cache
True
>>> # Now that we have exited the with, both cache and pool are unset
>>> get_cache() == cache
False
>>> cache.named_pool == pool
False
>>> test_dir.cleanup()

__exit__(*args)

Unsets the named pool on the currently set cache. If there was no cache set before setting this named pool then unset the cache as well.

Note

self.previously_entered_cache will either be None or the cache this named pool belongs to. It will be None if there was no cache set when we set this named pool. It will be this named pool’s cache if that cache was already set when we set this named pool. If there was a different cache set when we set this named pool, we would have errored in __enter__.
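The enter and exit bookkeeping described above can be modeled in a few lines. The sketch below is a simplified stand-in, not the QIIME 2 source: ToyCache, ToyPool and the class attribute ToyCache.current (playing the role of the "currently set cache") are hypothetical names.

```python
class ToyCache:
    # Stand-in for the "currently set cache"; None means no cache is set.
    current = None

    def __init__(self):
        self.named_pool = None


class ToyPool:
    def __init__(self, cache):
        self.cache = cache
        self.previously_entered_cache = None

    def __enter__(self):
        current = ToyCache.current
        if current is not None and current is not self.cache:
            raise ValueError("pool is not on the currently set cache")
        if self.cache.named_pool is not None:
            raise ValueError("a named pool is already set")
        # Either None or this pool's own cache; any other cache
        # would have raised above.
        self.previously_entered_cache = current
        ToyCache.current = self.cache
        self.cache.named_pool = self
        return self

    def __exit__(self, *args):
        # Restore whatever was set before entering.
        ToyCache.current = self.previously_entered_cache
        self.cache.named_pool = None


cache = ToyCache()
pool = ToyPool(cache)
with pool:
    inside = (ToyCache.current is cache, cache.named_pool is pool)
outside = (ToyCache.current is None, cache.named_pool is None)

# Entering a pool that lives on a different cache is rejected.
other = ToyPool(ToyCache())
try:
    with pool:
        with other:
            pass
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

Because the previous cache is recorded on entry, exiting restores the earlier state instead of unconditionally clearing it. A cache that was already set before the with block therefore remains set afterwards.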

save(ref)

Saves the data into the pool then loads a new ref backed by the data in the pool.

Parameters:

ref (Result) – The QIIME 2 result we are saving into this pool.

Returns:

A QIIME 2 result backed by the data in the cache the pool belongs to.

Return type:

Result

Examples

>>> import os
>>> import tempfile
>>> from qiime2.core.cache import Cache
>>> from qiime2.sdk.result import Artifact
>>> from qiime2.core.testing.type import IntSequence1
>>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-')
>>> cache_path = os.path.join(test_dir.name, 'cache')
>>> cache = Cache(cache_path)
>>> pool = cache.create_pool(key='pool')
>>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2])
>>> pool_artifact = pool.save(artifact)
>>> # The data itself resides in the cache this pool belongs to
>>> str(pool_artifact._archiver.path) == \
...     str(cache.data / str(artifact.uuid))
True
>>> # The pool now contains a symlink to the data. The symlink is named
>>> # after the uuid of the data.
>>> pool.get_data() == set([str(artifact.uuid)])
True
>>> test_dir.cleanup()

load(ref)

Loads a reference to an element in the pool.

Parameters:

ref (str or Result) – The result we are loading out of this pool, or just its uuid as a string.

Returns:

A result backed by the data in the cache that this pool belongs to.

Return type:

Result

Examples

>>> import os
>>> import tempfile
>>> from qiime2.core.cache import Cache
>>> from qiime2.sdk.result import Artifact
>>> from qiime2.core.testing.type import IntSequence1
>>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-')
>>> cache_path = os.path.join(test_dir.name, 'cache')
>>> cache = Cache(cache_path)
>>> pool = cache.create_pool(key='pool')
>>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2])
>>> pool_artifact = pool.save(artifact)
>>> loaded_artifact = pool.load(str(artifact.uuid))
>>> artifact == pool_artifact == loaded_artifact
True
>>> str(loaded_artifact._archiver.path) == \
...     str(cache.data / str(artifact.uuid))
True
>>> test_dir.cleanup()

remove(ref)

Removes an element from the pool. The element can be just the uuid of the data as a string, or it can be a Result object referencing the data we are trying to remove.

Parameters:

ref (str or Result) – The result we are removing from this pool, or just its uuid as a string.

Examples

>>> import os
>>> import tempfile
>>> from qiime2.core.cache import Cache
>>> from qiime2.sdk.result import Artifact
>>> from qiime2.core.testing.type import IntSequence1
>>> test_dir = tempfile.TemporaryDirectory(prefix='qiime2-test-temp-')
>>> cache_path = os.path.join(test_dir.name, 'cache')
>>> cache = Cache(cache_path)
>>> pool = cache.create_pool('pool')
>>> artifact = Artifact.import_data(IntSequence1, [0, 1, 2])
>>> pool_artifact = pool.save(artifact)
>>> pool.get_data() == set([str(artifact.uuid)])
True
>>> pool.remove(str(artifact.uuid))
>>> pool.get_data() == set()
True
>>> # Note that the data is still in the cache due to our
>>> # pool_artifact causing the process pool to keep a reference to it
>>> cache.get_data() == set([str(pool_artifact.uuid)])
True
>>> del pool_artifact
>>> # The data is still there even though the reference is gone because
>>> # the cache has not run its own garbage collection yet. For various
>>> # reasons, it is not feasible for us to safely garbage collect the
>>> # cache when a reference in memory is deleted. Note also that
>>> # "artifact" is not backed by the data in the cache, it only lives
>>> # in memory, but it does have the same uuid as "pool_artifact."
>>> cache.get_data() == set([str(artifact.uuid)])
True
>>> cache.garbage_collection()
>>> # Now it is gone
>>> cache.get_data() == set()
True
>>> test_dir.cleanup()

get_data()

Returns a set of all data in the pool.

Returns:

The uuids of all of the data in the pool.

Return type:

set[str]