FS configuration options ======================== Background ---------- Users import their image files to the OMERO.fs server. The contents of these files are kept intact by the server and the import process preserves the files' path and name (at least within the rules of :property:`omero.fs.repo.path_rules` below), so that OMERO.fs can become a trusted repository for the master copy of users' data. While the default server configuration from :doc:`config` should typically suffice, :program:`omero config set` may be used to adjust settings related to file uploads. These settings are explained below. Repository location ------------------- Several properties determine where FS-imported files are stored: - :property:`omero.data.dir` - singleton property (i.e. once globally) which points to the legacy repository location for OMERO. For OMERO to run on multiple systems, the contents of this directory must be on a shared volume. - :property:`omero.managed.dir` - singleton property which points to the default ``ManagedRepository``. In an OMERO install in which there is only one Blitz server, this will be the only repository. This need not be located under :property:`omero.data.dir` but is by default. - ``omero.repo.dir`` (experimental) - value passed to all non-legacy, standalone repositories. This is not actively used, but would allow hosting repositories on multiple physical systems without the need for a shared volume. For example, after running ``bin/omero admin start`` on the main machine, it would be possible to launch nodes on various machines via ``bin/omero node start fs-B``, ``bin/omero node start fs-C``, etc. Each of these would pass a different ``omero.repo.dir`` value to its process. Template path ------------- When files are uploaded to the managed repository, a parent directory is created to receive the upload. A multi-file image has all its files stored in the same parent directory, though they may be in different subdirectories of that parent to mirror the original directory structure before upload. The :property:`omero.fs.repo.path` setting defines the creation of that parent directory. It is this value which makes the ``ManagedRepository`` “managed”. Path naming constraints ^^^^^^^^^^^^^^^^^^^^^^^ There is some flexibility in how this parent directory is named. The constraints are: * The path components (individual directories in the path) must be separated by :literal:`/` characters. * A path component separator may be written as :literal:`//` only if followed by at least one more path component. In this case: * The server ensures that the path components preceding the :literal:`//` are owned by the :literal:`root` user. * Any newly created path components following the :literal:`//` are **owned by the user** who owns the images. * If no :literal:`//` is present then *all* newly created path components are **owned by the user** who owns the images. * The path must be unique for each import. It is for this reason that the :literal:`%time%` term expands to a time with millisecond resolution. * To avoid confusion with the expansion terms enumerated below, avoid other uses of the :literal:`%` character in path components. In the above, ownership of path components is in the context of OMERO users accessing the OMERO managed repository through its API. It does not relate to operating system users' permissions for the underlying filesystem. Expansion terms ^^^^^^^^^^^^^^^ Special terms may be used within path components: these are replaced with text that depends on the import. For any directory in the template path """""""""""""""""""""""""""""""""""""" :literal:`%userId%` expands to the user's numerical ID :literal:`%user%` expands to the user's name :literal:`%institution%` expands to the user's institution name; this path component is wholly omitted if the user has no institution set :literal:`%institution:default%` expands to the user's institution name, or to the supplied "default" if the user has no institution set; for instance, :literal:`%institution:State College of Florida, Manatee-Sarasota%` is permitted :literal:`%groupId%` expands to the OMERO group's numerical ID :literal:`%group%` expands to the OMERO group's name :literal:`%perms%` expands to the group's six-character permissions string, for example :literal:`rw----` for a private group :literal:`%year%` expands to the current year number, for example :literal:`2014` :literal:`%month%` expands to the current month number, zero-padded, for example :literal:`08` :literal:`%monthname%` expands to the current month name, for example :literal:`August` :literal:`%day%` expands to the current day number in the month, zero-padded, for example :literal:`04` :literal:`%sessionId%` expands to the session's numerical ID :literal:`%session%` expands to the session key (UUID) of the session, for example :literal:`6c2dae43-cfad-48ce-af6f-025569f9e6df` For user-owned directories only """"""""""""""""""""""""""""""" These expansion terms may not precede :literal:`//` in the template path. :literal:`%time%` expands to the current time, in hours, minutes, seconds, milliseconds, for example :literal:`13-49-07.727` :literal:`%hash%` expands to an eight-digit hexadecimal hash code that is constant for the set of files being imported, for example :literal:`0554E3A1` :literal:`%hash:digits%` expands as :literal:`%hash%`, where :literal:`digits` is a comma-separated list of how many digits of the hash to use in different subdirectories; for example, :literal:`hash-%hash:3,3,2%` expands to a form like :literal:`hash-123/456/78` :literal:`%increment%` expands to an integer that increases consecutively so as to create the next new directory, for example using :literal:`inc-%increment%` with preexisting directories up to :literal:`inc-24` would expand to :literal:`inc-25` :literal:`%increment:digits%` expands as :literal:`%increment%` where :literal:`digits` specifies a minimum length to which to zero-pad the integer, for example using :literal:`inc-%increment:3%` with preexisting directories up to :literal:`inc-024` would expand to :literal:`inc-025` :literal:`%subdirs%` expands to nothing until the preceding directory has more than one thousand entries, in which case it expands to an integer that increases consecutively to similarly limit the entry count in subdirectories; applies recursively to extend the number of path components as needed, so, using :literal:`example/below-%subdirs%` in the path, with :literal:`example/below-000` to :literal:`example/below-999` all "full", three-digit subdirectories below those are created, such as :literal:`example/below-123/456` :literal:`%subdirs:digits%` expands as :literal:`%subdirs%` where :literal:`digits` specifies to how many digits :literal:`%subdirs%` may expand for each path component: for example, :literal:`example/%subdirs:4%-below` allows ten thousand directory entries in :literal:`example` before creating :literal:`example/1234-below` and, much later, :literal:`example/1234-below/5678` No more than one of either :literal:`%subdirs%` or :literal:`%increment%` may be used in any one path component, although they may each be used many times in the whole path. Legal file names ---------------- Although OMERO.fs attempts to preserve file naming, the server's operating system or file system is likely to somehow constrain what file names may be stored by OMERO.fs. This is of particular concern when a user may upload from a more permissive system to a server on a less permissive system, or when it is anticipated that the server itself may be migrated to a less permissive system. The server never accepts Unicode control characters in file names. The :property:`omero.fs.repo.path_rules` setting defines the combination of restrictions that the server must apply in accepting file uploads. The restrictions are grouped into named sets: :literal:`Windows required` prohibits names with the characters :literal:`"`, :literal:`*`, :literal:`/`, :literal:`:`, :literal:`<`, :literal:`>`, :literal:`?`, :literal:`\\`, :literal:`|`, names beginning with :literal:`$`, the names :literal:`AUX`, :literal:`CLOCK$`, :literal:`CON`, :literal:`NUL`, :literal:`PRN`, :literal:`COM1` to :literal:`COM9`, :literal:`LPT1` to :literal:`LPT9`, and anything beginning with one of those names followed by :literal:`.` :literal:`Windows optional` prohibits names ending with :literal:`.` or a space :literal:`UNIX required` prohibits names with the character :literal:`/` :literal:`UNIX optional` prohibits names beginning with :literal:`.` or :literal:`-` These rules are applied to each separate path component of the file name on the client's system. So, for instance, an upload of a file :literal:`/tmp/myfile.tif` from a Linux system would satisfy the :literal:`UNIX required` restrictions because neither of the path components :literal:`tmp` and :literal:`myfile.tif` contains a :literal:`/` character. Applying the "optional" restrictions does not assist OMERO.fs at all; those restrictions are designed to ease manual maintenance of the directory specified by the :property:`omero.managed.dir` setting, being where the server stores users' uploaded files. Checksum algorithm ------------------ As the client uploads each file to the server, it calculates a checksum for the file. After the upload is complete the client reports that checksum to the server. The server then calculates the checksum for the corresponding file from its local filesystem and checks that it matches what the client reported. **File integrity** is thus **assured** because corruption during transmission or writing would be revealed by a checksum mismatch. There are various algorithms by which checksums may be calculated. The list of available algorithms is given by :property:`omero.checksum.supported`. To calculate comparable checksums the client and server use the same algorithm. The server API permits clients to specify the algorithm, but it is expected that they will typically accept the server default. The number that suffixes each of the checksum algorithm names specifies the bit width of the resulting checksum. A larger bit width makes it less likely that different files will have the same checksum by coincidence, but lengthens the checksum hex strings that are reported to the user and stored in the :literal:`hash` column of the :literal:`originalfile` table in the database.