Development of the OME Data Model¶
Warning
This page is being restructured following the decoupling of the data model from the Bio-Formats code repository. An updated version will be published shortly.
Introduction¶
This is a document describing a way to work and publish the OME model schema
on the OME website, based on observations of the 2016-06 release being
performed; that release version is used in the examples below. Throughout the
process it is important to not just copy and paste, but to understand what is
actually being done and why. The text below is not quite yet a step-by-step
guide, more a set of explanations that should make the necessary steps clear.
Many of the command-line scripts below assume that you start at the top level
of your Bio-Formats repository, and they include some /path/to
directories for you to adjust as appropriate.
Schema development¶
Clean the repository¶
In working with the Bio-Formats git repository, first clean the
unnecessary files away so that they cause no confusing clutter that
wastes your time. From the top-level bioformats
folder, while
ant clean and mvn clean are both fine approaches,
the most thorough is git clean -dfx.
Major or minor release?¶
A minor release of the OME model schema may suffice for changes like adding new legal values to an existing enumeration. A release must be major if,
- some documents that validate under the current release will not validate under the new one (a major “data-level” change)
- some terms in the schema have changed meaning and may thus be acted on differently (a major “information-level” change)
A major release requires changing the schema’s namespace. For a minor release it suffices to increment the value of the version attribute of the xsd:schema element, leaving the namespace unchanged.
Create the new schema directory¶
Note
This subsection is for major releases only. A minor release can reuse the schema directory for the current release, so skip over this part.
For the schema release process a high fraction of the necessary work occurs in Bio-Formats’ components/specification directory. Inside there, components/specification/released-schema contains a subdirectory for each schema, even before its actual release.
To preserve the version history, the creation of the new schema directory is performed across a pair of commits. First, the latest patch gets its new name, for example:
cd components/specification/released-schema
mkdir 2015-01
git mv 2013-10-dev-5/catalog.xml 2015-01
git mv 2013-10-dev-5/*.xsd 2015-01
then, for the subsequent commit, remember to do:
git checkout HEAD^ 2013-10-dev-5/catalog.xml
git checkout HEAD^ 2013-10-dev-5/*.xsd
git add 2013-10-dev-5/*
to restore the released files from the latest patch. In this way, the files of the actually released schema retain their version history.
For an even later commit one may consider:
git rm -r 2013-10-dev-?
which removes the patch versions if no longer desired.
Note
It may make sense to adjust the above git mv commands to
move fewer files to the release directory. For instance,
OMERO.xsd
is not used by the OME schema so need not be
released alongside it in the 2015-01
directory if has not been
changed since the previous release.
Catalog files¶
The released schema directories have catalog files that list their contents. For instance:
cd components/specification/released-schema
find . -name catalog.xml
Within each commit, each catalog file should be kept up to date with changes made in that same directory, such that the catalogs always list exactly the available schema definitions.
XML transforms¶
The changes made to the released schemas should be accompanied by changes to the XML transforms in components/specification/transforms. For major releases use git mv in renaming the upgrade and downgrade for the latest patch. Remember to restore the originals in a later commit, as above when restoring the schema definition files for the latest patch.
For minor releases it suffices to adjust the existing upgrade and downgrade transforms for the current release. Remember that users may be downgrading from an earlier minor version than this newest version.
The transforms’ analog of the catalog files is components/specification/transforms/ome-transforms.xml which should describe the transforms in its directory for that commit.
Search and replace¶
Note
This subsection is for major releases only. A minor release reuses the current release and patch versions, so skip over this part.
There are various references to the latest patch version and even the latest release version to be updated; the whole Bio-Formats repository requires checking.
In replacing the “2013-10-dev-5” schema references within the actual
schema definition files in the new released-schema/2015-01
directory, also update the copyright date in their headers, and the date
in ome.xsd
’s first xsd:documentation tag. Likewise, with the
XML transforms, update the copyright date in their headers, and in the
attributes appearing near the start of
components/specification/transforms/ome-transforms.xml.
Other files in which to fix the schema version include:
- components/autogen/build.properties and ant/xsd-fu.xml for code generation
- the Project Object Model, Maven’s pom.xml
- the components/specification/publish because of the HTML within
- checks in the Bio-Formats code for the latest schema version, including various Java classes (version.equals, SCHEMA_LOCATION, etc.)
Avoid changing:
- sample files in components/specification/samples
- old schema releases
Testing¶
Once the above changes have been made and committed, it is time to test. This requires having various prerequisites installed for Bio-Formats development, including for the C++ implementation. Before each test, clean the repository:
git clean -dfx
ant test
git clean -dfx
mvn test
git clean -dfx
TMPDIR=/tmp/bf-build-`date +%s`
mkdir $TMPDIR
pushd $TMPDIR
cmake `dirs +1`
make
ctest -V
popd
You may care to give make an additional -j
option
specifying the number of cores to use in parallelizing the build. Note
that the ctest step can take a long time.
Sample files¶
OME-XML sample files¶
Once the schemas and transforms are moved and named to fit the release version, then the sample files can be upgraded. A new copy of the sample files is created in a new directory, updated to the new schema using xsltproc with the new transform, then pretty-printed with xmllint or similar. A sufficient command-line approach is:
cd components/specification/samples
for SRC in `find 2015-01 -type f -name '*.ome' -o -name '*.xml'`
do DEST=`echo $SRC | sed -e 's/^2015-01/2016-06/‘`
mkdir -p `dirname $DEST`
<$SRC xsltproc ../transforms/2015-01-to-2016-06.xsl - | xmllint --format - >$DEST
done
The OME-TIFF files require special handling, as they do not have an automatic update tool. First, identify them and copy them to the new directory:
find 2015-01 -name '*.ome.tiff'
cp 2015-01/set-1-meta-companion/*.ome.tiff 2016-06/set-1-meta-companion/
Next, each OME-TIFF file must be edited to have the schema version changed to that of the new release. They are binary files so choice of editor is important; the other non-text data must be preserved. One of several suitable options is Emacs’ Hexl mode.
OME-TIFF sample files¶
Sample files for each schema release version are available under
https://downloads.openmicroscopy.org/images/OME-TIFF/. The sample files in the
previous release’s directory, and the multi-file samples in its
tubhiswt-*
directories, are upgraded to the new schema using
bfconvert from the updated Bio-Formats repository: in that
repository use ant tools to generate the necessary
bioformats_package.jar
Java archive file. The sample files from
the subdirectories are provided also as compressed “zip” archive files.
The files in the bioformats-artificial
subdirectory are
generated by other Bio-Formats classes. Putting these facts together,
setting up the new “2016-06” samples folder is easily achieved:
mkdir 2016-06
mkdir 2016-06/binaryonly
mkdir 2016-06/companion
mkdir 2016-06/modulo
cd 2015-01
for i in *.ome.tif*
do /path/to/bioformats/tools/bfconvert $i ../2016-06/$i
done
cd binaryonly
for i in *.ome.tif*
do /path/to/bioformats/tools/bfconvert $i ../../2016-06/binaryonly/$i
done
cd ../companion
for i in *.ome.tif*
do /path/to/bioformats/tools/bfconvert $i ../../2016-06/companion/$i
done
cd ../modulo
for i in *.ome.tif*
do /path/to/bioformats/tools/bfconvert $i ../../2016-06/modulo/$i
done
for i in tubhiswt-?D
do mkdir ../2016-06/$i
FROM=`ls $i | head -n 1`
TO=`echo $FROM | sed -e 's/_C0/_C%c/ ; s/_TP0/_TP%t/'`
/path/to/bioformats/tools/bfconvert $i/$FROM ../2016-06/$i/$TO
done
cd ../2016-06
for i in tubhiswt-?D ; do zip $i.zip $i/* ; done
mkdir bioformats-artificial
cd bioformats-artificial
BF_PROG=loci.formats.tools.MakeTestOmeTiff /path/to/bioformats/tools/bf.sh
for i in *.ome.tif ; do zip $i.zip $i ; done
Review the new sample files to ensure that they look correct. At the end of the next step they are published online.
Binary Only and companion files: The OMETiffWriter does not support the writing of sample BinaryOnly or Companion files. If the only required update is to change the schema version then the files may be edited with a Hex Editor. Any additional editing may change the length of the file and invalidate the tiff header.
In instances where more detailed changes are required to BinaryOnly samples:
- Write a short program using OMETiffReader and Writer to read and write the existing sample
- Using debugging tools, inject the desired OME XML prior to saveComment in OMETiffWriter close function
- Ensure when modifying the XML that the UUID values are correct
- Verify that files pass using xmlvalid and tiffinfo commands
Schema publication¶
Schema release¶
Once a specification change has been made into an ome-model release, the publish script in the https://github.com/ome/schemas repository automatically generates new schemas pages published at https://www.openmicroscopy.org/Schemas/.
Generated documentation¶
Documentation for the released schema must be generated from the
ome.xsd
definition file. The XML editor oXygen is recommended for
this task, and requires the schema definitions to have been published online
as described above. To build the generated documentation for a given release:
/Applications/oxygen/schemaDocumentationMac.sh https://www.openmicroscopy.org/Schemas/OME/$RELEASE/ome.xsd -cfg:components/specification/omeOxygenDocConfig.xml
Check that the documentation generated in the new output
directory all looks correct.
The SCHEMA-documentation job will generate the oXygen
documentation for a given version of the schema. Once generated, this documentation can be transferred to a $RELEASE
subfolder of
/var/www/html/www.openmicroscopy.org/specification/schema_doc
on web-prod.