The C++ codebase has been primarily a conversion of the original Java codebase, with some additional helper functions and classes added where needed. The intention is that the basic interfaces and classes should be identical between the two languages unless this is prevented by fundamental differences between the languages.
This section is intended to be useful for
In addition to documenting the specific language and class compatibility issues, this section also documents the idioms in use in the C++ code which might not be immediately clear by looking at the API reference, and which may not be familiar to Java developers.
While C++ and Java have some basic syntactical similarities, there are several basic differences in their type systems.
Java has primitive types and classes.
int i;
double d;
Pixels pixels = new Pixels();
Pixels[] array = new Pixels[5];
C++ has primitive types, structures and classes.
int16_t i1;
uint32_t i2;
double d;
// Allocate on the stack, or as a struct or class member:
Pixels pixels;
// Allocate on the heap
Pixels *pixelsptr1 = new Pixels();
// Pointer to existing instance
const Pixels *pixelsptr2 = &pixels;
// Reference to existing instance
Pixels& pixelsref(pixels);
Pixels array[5];
typedef is used to create an alias for an existing type.
typedef std::vector<std::string> string_list;
string_list l;
string_list::const_iterator i = l.begin();
// NOT std::vector<std::string>::const_iterator
typedef std::vector<Pixels> plist;
plist pl(6);
plist::size_type idx = 2;
// size_type NOT unsigned int or uint32_t
pl.at(idx) = ...;
Typedefs are used in the standard container types (e.g. size_type, value_type) and in classes and class templates. Consistency is needed for generic programming: use the standard type names to enable interoperability with standard algorithms.
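As an illustration of why the nested type names matter, the following sketch writes a function template purely in terms of a container's size_type, value_type and const_iterator, so that it works with any standard sequence container. The count_equal() helper and the example() function are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: counts the elements equal to a value, using only
// the nested type names which every standard container provides.
template <typename Container>
typename Container::size_type
count_equal(const Container& c, const typename Container::value_type& v)
{
  typename Container::size_type n = 0;
  for (typename Container::const_iterator i = c.begin(); i != c.end(); ++i)
    {
      if (*i == v)
        ++n;
    }
  return n;
}

typedef std::vector<std::string> string_list;

// Usage example: the same template works for any element type.
void example()
{
  string_list l;
  l.push_back("a");
  l.push_back("b");
  l.push_back("a");
  assert(count_equal(l, std::string("a")) == 2);

  std::vector<int> v(3, 7); // three elements, all equal to 7
  assert(count_equal(v, 7) == 3);
}
```

Because the return type is Container::size_type rather than a hard-coded integer type, the result always matches what the container's own size() would return.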
In Java, a throws clause details which exceptions are thrown by a method. Java exceptions are also “checked”, requiring the caller to catch and handle all exceptions which might be thrown, aside from RuntimeException and its subclasses.
C++ has exception specifications like Java; however, they are useless aside from the empty specification throw() (superseded by noexcept in C++11). This is because if an exception is thrown which does not match the specification, std::unexpected() is called, which by default terminates the program; this makes them unusable in practice.
Exceptions can be thrown at any point, except that they should never be thrown from a destructor. It is neither necessary nor typical to catch exceptions except where they can be handled. All code must be exception-safe given that an exception could be thrown at any point; the design considerations for exception safety are covered below.
Java supports single-inheritance, plus interfaces. C++ supports true multiple-inheritance, which is rather more flexible, at the expense of being rather more complicated and dangerous. However, the Java single-inheritance-plus-interfaces model can be implemented in C++ using a subset of the facilities provided by multiple inheritance. Rather than being enforced by the language, it is a set of idioms. These must be rigorously followed or else things will fail horribly!
C++ interfaces are classes with:
C++ classes implementing interfaces:
When compiled with optimization enabled, the interface classes should have zero storage overhead. If implementing classes do not use virtual public inheritance, compilation will fail as soon as a second class in the inheritance hierarchy also implements the interface.
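The idiom can be sketched as follows. This is a minimal illustration which assumes the usual C++ interface conventions (no data members, pure virtual methods and a virtual destructor); the Named, Sized, BasicShape and Square classes are hypothetical and are not part of the OME-Files API.

```cpp
#include <cassert>
#include <string>

// Interface: no data members, pure virtual methods, virtual destructor.
class Named
{
public:
  virtual ~Named() {}
  virtual std::string name() const = 0;
};

// A second, unrelated interface.
class Sized
{
public:
  virtual ~Sized() {}
  virtual unsigned int size() const = 0;
};

// First implementing class: virtual public inheritance from the interface.
class BasicShape : virtual public Named
{
public:
  std::string name() const { return "shape"; }
};

// A second class in the hierarchy implements Named again, plus Sized.
// Because the inheritance is virtual, there is only one Named subobject.
class Square : public BasicShape, virtual public Named, virtual public Sized
{
public:
  std::string name() const { return "square"; }
  unsigned int size() const { return 4; }
};

// Usage example: a Square may be used through either interface.
void example()
{
  Square s;
  const Named& n = s; // unambiguous, thanks to virtual inheritance
  const Sized& z = s;
  assert(n.name() == "square");
  assert(z.size() == 4);
}
```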
Plain (or “dumb”) C++ pointers can be dangerous if used incorrectly. The OME-Files API makes a point of never using them unless absolutely necessary. For automatic objects allocated on the stack, allocation and deallocation is automatic and safe:
{
Image i(filename);
i.read_plane();
// Object destroyed when i goes out of scope
}
In this case, the object’s destructor was run and the memory freed automatically.
Looking at the case where a pointer is used to reference manually-allocated memory on the heap:
{
Image *i = new Image(filename);
i->read_plane();
// Memory not freed when pointer i goes out of scope
}
In this case new was not paired with the corresponding delete, resulting in a memory leak. This is the code with the “leak” fixed:
{
Image *i = new Image(filename);
i->read_plane(); // throws exception; memory leaked
delete i; // never called
}
new and delete are now paired, but the code is not exception-safe. If an exception is thrown, memory will still be leaked. Manual memory management requires correct cleanup at every exit point of the function, including every return statement and every thrown exception. Here, we handle this correctly:
{
Image *i = new Image(filename);
try {
i->read_plane(); // throws exception
} catch (const std::runtime_error& e) {
delete i; // clean up
throw; // rethrow
}
delete i; // never called for exceptions
}
However, this does not scale: it is painful and error-prone when applied to an entire codebase. This simple function has only a single variable, a single exception and a single return to deal with. Imagine the combinatorial explosion when there are several variables with different lifetimes and scopes, multiple return points and several exceptions to handle. This is easy to get wrong, so a more robust approach is needed.
In the general case, use of new is neither safe nor sensible. The OME-Files API never passes pointers allocated with new, nor requires any manual memory management. Instead, “smart” pointers are used throughout to manage memory safely and automatically.
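For comparison with the manual version, here is a sketch of the same operation using std::shared_ptr. The Image class below is a hypothetical stand-in (with an instance counter added so that the cleanup can be observed), not the real OME-Files class, and try_read() is likewise an illustrative name.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the Image class used in the examples.
class Image
{
public:
  explicit Image(const std::string& filename) : filename_(filename) { ++live; }
  ~Image() { --live; }

  // Throws if there is nothing to read.
  void read_plane()
  {
    if (filename_.empty())
      throw std::runtime_error("no filename");
  }

  static int live; // number of Image objects currently alive

private:
  std::string filename_;
};

int Image::live = 0;

// Returns true if the plane was read.  The Image is freed on every exit
// path, including the exceptional one, without any explicit delete.
bool try_read(const std::string& filename)
{
  try
    {
      std::shared_ptr<Image> i(std::make_shared<Image>(filename));
      i->read_plane(); // may throw; i is still destroyed during unwinding
      return true;
    }
  catch (const std::runtime_error&)
    {
      return false;
    }
}

// Usage example: no Image is leaked whether or not an exception is thrown.
void example()
{
  assert(try_read("image.tiff"));
  assert(Image::live == 0);
  assert(!try_read(""));
  assert(Image::live == 0);
}
```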
Resource Acquisition Is Initialization (RAII) is a programming idiom used throughout modern C++ libraries and applications, including the Standard Library,
Because this relies implicitly upon the deterministic object destruction guarantees made by the C++ language, this is not used widely in Java APIs which often require manual management of resources such as open files. Used carefully, RAII will prevent resource leaks and result in robust, safe code.
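A minimal sketch of the idiom follows. Device is a hypothetical resource with explicit open() and close() calls, standing in for a file handle, lock or similar; DeviceSession is the RAII wrapper which acquires the resource in its constructor and releases it in its destructor.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource requiring paired open/close calls.
struct Device
{
  Device() : open_count(0) {}
  void open() { ++open_count; }
  void close() { --open_count; }
  int open_count;
};

// RAII wrapper: acquire in the constructor, release in the destructor.
class DeviceSession
{
public:
  explicit DeviceSession(Device& device) : device_(device) { device_.open(); }
  ~DeviceSession() { device_.close(); }

  // Copying would release the resource twice, so it is disabled.
  DeviceSession(const DeviceSession&) = delete;
  DeviceSession& operator=(const DeviceSession&) = delete;

private:
  Device& device_;
};

// The device is closed on every exit path; no cleanup code is needed here.
void use_device(Device& device, bool fail)
{
  DeviceSession session(device);
  if (fail)
    throw std::runtime_error("failure while device is open");
}

// Usage example: the device is closed after normal and exceptional exits.
void example()
{
  Device d;
  use_device(d, false);
  assert(d.open_count == 0);
  try
    {
      use_device(d, true);
    }
  catch (const std::runtime_error&)
    {
    }
  assert(d.open_count == 0);
}
```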
The FormatReader API is currently not using RAII due to the use of the FormatHandler::setId() interface.
// Non-constant                     Constant
// -----------------------------    --------------------------------------
// Pointer
Image *i;                           const Image *i;
Image * const i;                    const Image * const i;
// Reference
Image& i;                           const Image& i;
// Shared pointer
std::shared_ptr<Image> i;           std::shared_ptr<const Image> i;
const std::shared_ptr<Image> i;     const std::shared_ptr<const Image> i;
// Shared pointer reference
std::shared_ptr<Image>& i;          std::shared_ptr<const Image>& i;
const std::shared_ptr<Image>& i;    const std::shared_ptr<const Image>& i;
// Weak pointer
std::weak_ptr<Image> i;             std::weak_ptr<const Image> i;
const std::weak_ptr<Image> i;       const std::weak_ptr<const Image> i;
// Weak pointer reference
std::weak_ptr<Image>& i;            std::weak_ptr<const Image>& i;
const std::weak_ptr<Image>& i;      const std::weak_ptr<const Image>& i;
Java has one reference type. Here, we have 22. Clearly, not all of these will typically be used. Below, a subset of these are shown for use for particular purposes.
Class member types:
Image i; // Concrete instance
std::shared_ptr<Image> i; // Reference
std::weak_ptr<Image> i; // Weak reference
Wherever possible, a concrete instance should be preferred. This is not possible for polymorphic types, where a reference is required. In this situation, an std::shared_ptr is preferred if the class owns the member and/or needs control over its lifetime. If the class does not have ownership then an std::weak_ptr will allow safe access to the object if it still exists. In circumstances where manual lifetime management is required, e.g. for performance, and the member is guaranteed to exist for the duration of the object’s lifetime, a plain pointer or reference may be used. A pointer will be used if it is possible for it to be null, or it may be reassigned more than once, or if it is assigned after initial construction. If properly using RAII, using references should be possible and preferred over bare pointers in all cases.
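The difference between an owning std::shared_ptr member and a non-owning std::weak_ptr member can be sketched as follows. The Image, Document and Viewer classes here are hypothetical and exist only to illustrate the ownership relationships.

```cpp
#include <cassert>
#include <memory>

// Hypothetical image type.
struct Image
{
  explicit Image(int w) : width(w) {}
  int width;
};

// Owns its image: the std::shared_ptr member controls the lifetime.
class Document
{
public:
  explicit Document(int width) : image_(std::make_shared<Image>(width)) {}
  const std::shared_ptr<Image>& get_image() const { return image_; }
  void close() { image_.reset(); }

private:
  std::shared_ptr<Image> image_;
};

// Does not own the image: the std::weak_ptr member gives safe access
// only for as long as the image still exists.
class Viewer
{
public:
  explicit Viewer(const std::shared_ptr<Image>& image) : image_(image) {}

  // Returns the image width, or -1 if the image no longer exists.
  int width() const
  {
    std::shared_ptr<Image> locked(image_.lock());
    return locked ? locked->width : -1;
  }

private:
  std::weak_ptr<Image> image_;
};

// Usage example: the viewer notices when the owner releases the image.
void example()
{
  Document doc(512);
  Viewer viewer(doc.get_image());
  assert(viewer.width() == 512);
  doc.close();
  assert(viewer.width() == -1);
}
```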
Argument types:
// Ownership retained
void read_plane(const Image& image);
// Ownership shared or transferred
void read_plane(const std::shared_ptr<Image>& image);
Passing primitive types by value is acceptable. However, passing a struct or class by value will implicitly copy the object into the callee’s stack frame, which may be expensive (and requires a copy constructor which will not be guaranteed or even possible for polymorphic types). Passing by reference avoids the need for any copying, and passing by const reference will prevent the callee from modifying the object, also making it clear that there is no transfer of ownership. Passing using an std::shared_ptr is possible but not recommended—the copy will involve reference counting overhead which can kill multi-threaded performance since it requires synchronization between all threads; use a const reference to an std::shared_ptr to avoid the overhead. If ownership should be transferred or shared with the callee, use a non-const reference.
To be absolutely clear, plain pointers are never used and are not acceptable for ownership transfer. A plain reference also makes it clear there is no ownership transfer.
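The two argument conventions can be observed with std::shared_ptr::use_count(). In this sketch, Image, count_planes() and Cache are hypothetical names; the point is only that passing by const reference leaves the reference count untouched, while a callee which keeps a copy of the std::shared_ptr shares ownership.

```cpp
#include <cassert>
#include <memory>

// Hypothetical image type.
struct Image
{
  int planes;
};

// Ownership retained by the caller: the callee only inspects the object.
int count_planes(const Image& image)
{
  return image.planes;
}

class Cache
{
public:
  // Ownership shared: the callee copies the pointer and keeps a reference.
  void store(const std::shared_ptr<Image>& image) { held_ = image; }
  void clear() { held_.reset(); }

private:
  std::shared_ptr<Image> held_;
};

// Usage example: watch the reference count change as ownership is shared.
void example()
{
  std::shared_ptr<Image> image(std::make_shared<Image>());
  image->planes = 3;

  assert(count_planes(*image) == 3);
  assert(image.use_count() == 1); // no ownership change

  Cache cache;
  cache.store(image);
  assert(image.use_count() == 2); // now shared with the cache

  cache.clear();
  assert(image.use_count() == 1);
}
```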
Return types:
Image get_image(); // Ownership transferred
Image& get_image(); // Ownership retained
std::shared_ptr<Image> get_image(); // Ownership shared/transferred
If the callee does not retain a copy of the original object, it can’t return it by reference since it can’t guarantee that the object will remain in scope after it returns; hence it must create a temporary value and return by value. If the callee does retain a copy, it has the option of returning by reference. Returning by reference is preferred when possible. Returning by value implies ownership transfer. Returning by reference implies ownership retention. Returning an std::shared_ptr by value or reference implies sharing ownership since the caller can retain a reference; when returning by value, ownership may be transferred since this implies the callee is not retaining a reference to it (but this is not guaranteed).
Again, to be absolutely clear, plain pointers are never used and are not acceptable for ownership transfer. A plain reference also makes it clear there is no ownership transfer.
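The three return conventions might look like this in practice. The Image and Metadata classes and their member functions are hypothetical and are chosen only to show one method of each kind.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical metadata type.
struct Metadata
{
  Metadata() : planes(1) {}
  int planes;
};

// Hypothetical image type with one method per return convention.
class Image
{
public:
  Image() : name_("untitled"), metadata_(std::make_shared<Metadata>()) {}

  // By value: the result is a new object owned by the caller.
  std::string describe() const { return "Image: " + name_; }

  // By reference: the Image retains ownership; the reference is valid
  // only for as long as the Image exists.
  const std::string& name() const { return name_; }

  // By shared pointer: ownership is shared between Image and caller.
  std::shared_ptr<Metadata> metadata() const { return metadata_; }

private:
  std::string name_;
  std::shared_ptr<Metadata> metadata_;
};

// Usage example.
void example()
{
  Image image;
  assert(image.describe() == "Image: untitled");
  assert(image.name() == "untitled");

  std::shared_ptr<Metadata> m(image.metadata());
  assert(m.use_count() == 2); // held by both the Image and the caller
}
```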
C++ arrays are not safe to pass in or out of functions since the size is not known unless passed separately.
class Image
{
// Unsafe; size unknown
uint8_t *getLUT();
void setLUT(const uint8_t lut[]);
};
C++ arrays “decay” to “bare” pointers, and pointers have no associated size information. std::array is a safe alternative.
class Image
{
typedef std::array<uint8_t, 256> LUT;
// Safe; size defined
const LUT& getLUT() const;
void setLUT(const LUT&);
};
std::array is an array-like object (a class which behaves like an array). Its type and size are defined in the template, and it may be passed around like any other object. Its array::at() method provides strict bounds checking, while its index operator, array::operator[](), provides unchecked access.
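The difference between checked and unchecked access can be sketched as follows. The make_inverted_lut() and lookup() functions are hypothetical helpers written for this example; only std::array itself is standard.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <stdexcept>

typedef std::array<std::uint8_t, 256> LUT;

// Builds an inverted greyscale lookup table (0 maps to 255, 255 to 0).
LUT make_inverted_lut()
{
  LUT lut;
  for (LUT::size_type i = 0; i < lut.size(); ++i)
    lut[i] = static_cast<std::uint8_t>(255 - i); // unchecked; i is in range
  return lut;
}

// Checked lookup: returns false rather than reading out of bounds.
bool lookup(const LUT& lut, LUT::size_type index, std::uint8_t& value)
{
  try
    {
      value = lut.at(index); // throws std::out_of_range if index >= size()
      return true;
    }
  catch (const std::out_of_range&)
    {
      return false;
    }
}

// Usage example.
void example()
{
  const LUT lut(make_inverted_lut());
  assert(lut.size() == 256); // the size travels with the object

  std::uint8_t value = 0;
  assert(lookup(lut, 10, value) && value == 245);
  assert(!lookup(lut, 256, value)); // one past the end is rejected
}
```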
Pixel data is handled differently between the Java and C++ implementations. The primary reason for the difference is that the Java code uses raw byte[] arrays to contain pixel data. This could not be implemented in C++ due to the limitations of C++ arrays discussed above, as well as having a number of additional limitations:
The solution was to create a dedicated PixelBuffer template class which could represent pixels of any type. This is contained by a VariantPixelBuffer class which can contain any of the supported pixel types. This is therefore both flexible and strongly-typed. The C++ code is slightly more complex as a result, but it is safer, and the buffer can be passed around without the need for any additional metadata to describe its type, size and ordering. This can make passing pixel data between different libraries much more transparent.
Additional differences include the semantics of how the FormatReader::openBytes() and FormatWriter::saveBytes() methods are implemented. The API is the same, but the default behavior is a little different. All well-written code should cope with the differences, but code making assumptions may require some attention.
The Java TIFF reader classes’ FormatReader::isInterleaved() method will always return false, irrespective of the TIFF PlanarConfiguration tag. As a result, FormatReader::openBytes() will always return pixel data with samples on separate contiguous planes. In contrast, the C++ TIFF reader classes’ FormatReader::isInterleaved() method will return true if the TIFF PlanarConfiguration is CONTIG and false if SEPARATE, and the FormatReader::openBytes() method will return pixel data with the appropriate interleaving, matching the same format in the TIFF file. The Java behavior is due to the implementation details of its TIFF reading code; the C++ code uses libtiff and simply passes back the pixel data without any rearrangement. Java code which assumes it will never receive interleaved data will need to be updated to cope with it when porting to C++.
The Java TIFF writer will always set interleaving if the number of samples per pixel is one (which is the recommended behavior), overriding FormatWriter::setInterleaved(). The C++ TIFF writer will always set interleaving based upon FormatWriter::setInterleaved(), and will not override the request of the caller. This discrepancy will be rectified in a future release to match the behavior of the Java writer; in practice there is no difference in the pixel ordering since interleaving is irrelevant when there is only one sample per pixel.
To obtain the Java TIFF reader behavior in C++, i.e. to obtain non-interleaved pixel data, create a VariantPixelBuffer with the desired pixel type and interleaving (use the PixelBufferBase::make_storage_order() helper method to create the dimension order without interleaving), and then assign the buffer filled by FormatReader::openBytes() to this buffer; the data will be transparently converted to the desired ordering on assignment.
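To make the two orderings concrete, the following sketch reorders interleaved samples into separate contiguous planes using a plain std::vector. It deliberately does not use the OME-Files API; deinterleave() is a hypothetical function which only illustrates the kind of rearrangement such a conversion implies.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Converts interleaved samples (RGBRGB...) into separate contiguous
// planes (RR...GG...BB...).  Hypothetical helper, not OME-Files API.
template <typename T>
std::vector<T>
deinterleave(const std::vector<T>& interleaved, std::size_t samples_per_pixel)
{
  if (samples_per_pixel == 0 || interleaved.size() % samples_per_pixel != 0)
    throw std::invalid_argument("size is not a multiple of samples per pixel");

  const std::size_t pixels = interleaved.size() / samples_per_pixel;
  std::vector<T> planar(interleaved.size());
  for (std::size_t p = 0; p < pixels; ++p)
    {
      for (std::size_t s = 0; s < samples_per_pixel; ++s)
        planar[s * pixels + p] = interleaved[p * samples_per_pixel + s];
    }
  return planar;
}

// Usage example: two RGB pixels, (1,2,3) and (4,5,6).
void example()
{
  const std::vector<int> interleaved = {1, 2, 3, 4, 5, 6};
  const std::vector<int> expected = {1, 4, 2, 5, 3, 6};
  assert(deinterleave(interleaved, 3) == expected);
}
```

With one sample per pixel the two orderings are identical, which is why interleaving is irrelevant in that case.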