Ocata Series Release Notes


Bug Fixes

  • Fixed a deadlock when logging from a tpool thread. The object server runs certain IO-intensive methods outside the main pthread for performance. Previously, if one of those methods tried to log, this could cause a crash that eventually led to an object server with hundreds or thousands of greenthreads, all deadlocked. The fix is to use a mutex that works across different greenlets and different pthreads.

  • Fixed a rare issue where multiple backend timeouts could result in bad data being returned to the client.

  • Removed a race condition where a POST to an SLO could modify the X-Static-Large-Object metadata.

  • Fixed a cache invalidation issue related to GET and PUT requests to containers that would occasionally cause object PUTs to a container to 404 after the container had been successfully created.


Bug Fixes

  • Fixed a bug in the EC reconstructor where an unsuccessful sync would cause extra disk I/O load on the remote server. Now the extra checking work is only requested if the sync request was successful.

  • Fixed an error where a container drive error resulted in double space usage on the remaining drives. When a drive holding a container or account database was unmounted, the bug would create handoff replicas on all remaining drives, increasing the drive space used and filling the cluster.

  • Fixed some minor test compatibility issues.

  • Updated docs to reference appropriate ports.


New Features

  • Improved performance by eliminating an unneeded directory structure hash.

  • Optimized the common case for hashing filesystem trees, thus eliminating a lot of extraneous disk I/O.

  • Updated the hashes.pkl file format to include timestamp information for race detection. Also simplified hashing logic to prevent race conditions and optimize for the common case.

  • The erasure code reconstructor will now shuffle work jobs across all disks instead of going disk-by-disk. This eliminates single-disk I/O contention and allows continued scaling as concurrency is increased.

  • Erasure code reconstruction handles moving data from handoff nodes better. Instead of moving the data to another handoff, it waits until it can be moved to a primary node.

  • Temporary URLs now support one common form of ISO 8601 timestamps in addition to Unix seconds-since-epoch timestamps. The ISO 8601 format accepted is ‘%Y-%m-%dT%H:%M:%SZ’. This makes TempURLs more user-friendly to produce and consume.

  • Listing containers in accounts with json or xml now includes a last_modified time. This does not change any on-disk data, but simply exposes the value to offer consistency with the object listings on containers.

  • I/O priority is now supported on AArch64 architecture.
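
As a minimal sketch of the ISO 8601 TempURL support noted above, the following Python converts between Unix timestamps and the accepted '%Y-%m-%dT%H:%M:%SZ' form. It assumes, following the TempURL middleware documentation as best understood here, that the signature is still computed over the Unix timestamp and that only the temp_url_expires query value changes form. The function names, path, and key are illustrative, not part of Swift.

```python
import calendar
import hmac
import time
from hashlib import sha1

# The one ISO 8601 form named in the release note.
ISO8601_FORMAT = '%Y-%m-%dT%H:%M:%SZ'


def to_iso8601(unix_ts):
    """Render a Unix timestamp as a UTC ISO 8601 string."""
    return time.strftime(ISO8601_FORMAT, time.gmtime(unix_ts))


def from_iso8601(stamp):
    """Parse a UTC ISO 8601 string back into Unix seconds."""
    return calendar.timegm(time.strptime(stamp, ISO8601_FORMAT))


def temp_url_query(method, path, key, expires_unix):
    """Build a TempURL query string with an ISO 8601 expiry.

    Assumption: the HMAC body uses the Unix timestamp, and only the
    URL carries the ISO 8601 rendering of the same instant.
    """
    hmac_body = '%s\n%d\n%s' % (method, expires_unix, path)
    sig = hmac.new(key.encode('utf-8'), hmac_body.encode('utf-8'),
                   sha1).hexdigest()
    return 'temp_url_sig=%s&temp_url_expires=%s' % (
        sig, to_iso8601(expires_unix))
```

For example, to_iso8601(1323479485) gives '2011-12-10T01:11:25Z', and from_iso8601 reverses it.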

Upgrade Notes

  • If you upgrade and roll back, you must delete all hashes.pkl files.
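
A rough sketch of how an operator might locate and remove the hashes.pkl files after a rollback is shown below. It assumes the usual on-disk layout of <devices>/<device>/objects*/<partition>/hashes.pkl; the function names and the dry-run default are illustrative choices, not a Swift tool, and the layout should be checked against the cluster before deleting anything.

```python
import glob
import os


def find_hashes_pkl(devices_root):
    """Return every hashes.pkl path under the assumed layout
    <devices_root>/<device>/objects*/<partition>/hashes.pkl."""
    pattern = os.path.join(devices_root, '*', 'objects*', '*', 'hashes.pkl')
    return sorted(glob.glob(pattern))


def delete_hashes_pkl(devices_root, dry_run=True):
    """Delete the files found above and return their paths.

    With dry_run=True (the default) nothing is removed, so the list
    can be reviewed first.
    """
    paths = find_hashes_pkl(devices_root)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Running delete_hashes_pkl('/srv/node') first, with the default dry run, lists what would be removed; /srv/node is only a common default for the devices directory.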

Deprecation Notes

  • If using erasure coding with ISA-L in rs_vand mode and 5 or more parity fragments, Swift will emit a warning. This is a configuration that is known to harm data durability. In a future release, this warning will be upgraded to an error unless the policy is marked as deprecated. All data in an erasure code storage policy using isa_l_rs_vand with 5 or more parity fragments should be migrated as soon as possible. Please see https://bugs.launchpad.net/swift/+bug/1639691 for more information.

  • The erasure code reconstructor handoffs_first option has been deprecated in favor of handoffs_only. handoffs_only is far more useful, and just like handoffs_first mode in the replicator, it gives the operator the option of forcing the consistency engine to focus solely on revert (handoff) jobs, thus improving the speed of rebalances. The handoffs_only behavior is somewhat consistent with the replicator’s handoffs_first option (any error on any handoff in the replicator will make it essentially handoff only forever) but the handoffs_only option does what you want and is named correctly in the reconstructor.

  • The default for object_post_as_copy has been changed to False. The option is now deprecated and will be removed in a future release. If your cluster is still running with post-as-copy enabled, please update it to use the “fast-post” method. Future versions of Swift will not support post-as-copy, and future features will not be supported under post-as-copy. (“Fast-post” is the behavior when object_post_as_copy is false.)
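
To check ahead of time whether a cluster would trigger the isa_l_rs_vand warning described above, a sketch like the following could scan the storage policy sections of swift.conf. It assumes the standard [storage-policy:N] sections with the ec_type and ec_num_parity_fragments options; the function name and threshold constant are illustrative and this is not how Swift itself performs the check.

```python
import configparser

RISKY_EC_TYPE = 'isa_l_rs_vand'
MAX_SAFE_PARITY = 4  # the note warns at 5 or more parity fragments


def risky_policies(swift_conf_text):
    """Return the storage-policy section names that combine
    isa_l_rs_vand with 5 or more parity fragments."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(swift_conf_text)
    risky = []
    for section in parser.sections():
        if not section.startswith('storage-policy:'):
            continue
        opts = parser[section]
        if opts.get('ec_type') != RISKY_EC_TYPE:
            continue
        parity = opts.getint('ec_num_parity_fragments', fallback=0)
        if parity > MAX_SAFE_PARITY:
            risky.append(section)
    return risky
```

Passing the contents of swift.conf as a string returns the affected section names, which can then be planned for migration.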

Bug Fixes

  • Fixed a bug where the ring builder would not allow removal of a device when min_part_seconds_left was greater than zero.

  • Fixed a bug where an SLO download with a range request may have resulted in a 5xx series response.

  • SLO manifest PUT requests can now be properly validated by sending an ETag header of the md5 sum of the concatenated md5 sums of the referenced segments.

  • Fixed the stats calculation in the erasure code reconstructor.

  • Rings with min_part_hours set to zero will now only move one partition replica per rebalance, thus matching behavior when min_part_hours is greater than zero.
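
The SLO manifest ETag mentioned above can be computed client-side roughly as follows. This sketch assumes the segment md5 sums are concatenated as hexadecimal strings in manifest order before being hashed again, and that no segment uses a sub-range; both helper names are illustrative.

```python
from hashlib import md5


def segment_etag(data):
    """md5 hex digest of one segment's bytes."""
    return md5(data).hexdigest()


def slo_manifest_etag(segment_etags):
    """md5 of the concatenated segment md5 hex digests.

    Assumption: digests are joined as hex strings, in the same
    order the segments appear in the manifest.
    """
    joined = ''.join(segment_etags)
    return md5(joined.encode('ascii')).hexdigest()
```

The resulting value would be sent as the ETag header on the manifest PUT so the server can validate the referenced segments.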

Other Notes

  • Various other minor bug fixes and improvements.


New Features

  • Ring files now include byteorder information about the endianness of the machine used to generate the file, and the values are appropriately byteswapped if deserialized on a machine with a different endianness. Newly created ring files will be byteorder agnostic, but previously generated ring files will still fail on different endian architectures. Regenerating older ring files will cause them to become byteorder agnostic. The regeneration of the ring files will not cause any new data movement. Newer ring files will still be usable by older versions of Swift (on machines with the same endianness; this maintains existing behavior).

  • All 416 responses will now include a Content-Range header with an unsatisfied-range value. This allows the caller to know the valid range request value for an object.

  • TempURLs now support validation against a common prefix. A prefix-based signature grants access to all objects which share the same prefix. This avoids the creation of a large number of signatures when a whole container or pseudofolder is shared.

  • In SLO manifests, the etag and size_bytes keys are now fully optional. Previously, the keys needed to exist but the values were optional. The only required key is path.

  • Respect server type for --md5 check in swift-recon.
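
A prefix-based TempURL signature, as described above, might be generated along these lines. The sketch assumes the documented convention of signing "prefix:" followed by the full path prefix and passing the object-name prefix in a temp_url_prefix query parameter; the function name and example values are illustrative.

```python
import hmac
from hashlib import sha1
from urllib.parse import urlencode


def prefix_temp_url_query(method, container_path, prefix, key, expires):
    """Build the query string for a prefix-based TempURL.

    container_path is e.g. '/v1/AUTH_account/container'; prefix is
    the shared object-name prefix, e.g. 'photos/'.
    Assumption: the signed path is 'prefix:' + container_path + '/'
    + prefix.
    """
    signed_path = 'prefix:%s/%s' % (container_path, prefix)
    hmac_body = '%s\n%d\n%s' % (method, expires, signed_path)
    sig = hmac.new(key.encode('utf-8'), hmac_body.encode('utf-8'),
                   sha1).hexdigest()
    return urlencode({
        'temp_url_sig': sig,
        'temp_url_expires': expires,
        'temp_url_prefix': prefix,
    })
```

The same query string can then be appended to the URL of any object whose name starts with the prefix, so one signature covers the whole pseudofolder.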

Bug Fixes

  • Correctly handle deleted files with if-none-match requests.

  • Correctly send 412 Precondition Failed if a user sends an invalid copy destination. Previously Swift would send a 500 Internal Server Error.

  • Fixed a rare infinite loop in swift-ring-builder while placing parts.

  • Ensure update of the container by object-updater, removing a rare possibility that objects would never be added to a container listing.

  • Fixed non-deterministic suffix updates in hashes.pkl where a partition may be updated much less often than expected.

  • Fixed a regression in consolidate_hashes that occurred when a new file was stored to a new suffix in a non-empty partition. This bug was introduced in 2.7.0 and could cause an increase in rsync replication stats during and after upgrade, due to inconsistent hashing of partition suffixes.

  • Account and container databases will now be quarantined if the database schema has been corrupted.

  • Remove empty db hash and suffix directories if a db gets quarantined.

Other Notes

  • Removed “in-process-” from func env tox name to work with upstream CI.

  • Various other minor bug fixes and improvements.


New Features

  • The improvements to EC reads made in Swift 2.10.0 have also been applied to the reconstructor. This allows fragments to be rebuilt in more circumstances, resulting in faster recovery from failures.

  • Instead of using a separate .durable file to indicate the durable status of an EC fragment archive, we rename the .data to include a durable marker in the filename. This saves one inode for every EC .data file. Existing .durable files will not be removed, and they will continue to work just fine.

  • Closed a bug where ssync may have written bad fragment data in some circumstances. A check was added to ensure the correct number of bytes is written for a fragment before finalizing the write. Also, erasure coded fragment metadata will now be validated on read requests and, if bad data is found, the fragment will be quarantined.

  • Added a configurable URL base to staticweb.

  • Support multi-range GETs for static large objects.

  • TempURLs using the “inline” parameter can now also set the “filename” parameter. Both are used in the Content-Disposition response header.

  • Mirror X-Trans-Id to X-Openstack-Request-Id.

  • SLO will now concurrently HEAD segments, resulting in much faster manifest validation and object creation. By default, two HEAD requests will be done at a time, but this can be changed by the operator via the new concurrency setting in the “[filter:slo]” section of the proxy server config.

  • Suppressed the KeyError message when auditor finds an expired object.

  • Daemons using InternalClient can now be properly killed with SIGTERM.

  • Added a “user” option to the drive-audit config file. Its value is used to set the owner of the drive-audit recon cache.

  • Throttle update_auditor_status calls so it updates no more than once per minute.

  • Suppress unexpected-file warnings for rsync temp files.
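
To illustrate the durable marker in EC .data filenames mentioned above, here is a small parsing sketch. It assumes filenames of the form <timestamp>#<frag_index>.data for non-durable fragments and <timestamp>#<frag_index>#d.data for durable ones; that layout is an assumption about the on-disk format, and the function is illustrative rather than part of Swift.

```python
def parse_ec_data_filename(filename):
    """Split an EC fragment archive filename into its parts.

    Assumed layouts:
      '<timestamp>#<frag_index>.data'    (not yet durable)
      '<timestamp>#<frag_index>#d.data'  (durable)
    """
    suffix = '.data'
    if not filename.endswith(suffix):
        raise ValueError('not a .data file: %r' % filename)
    parts = filename[:-len(suffix)].split('#')
    if len(parts) == 2:
        durable = False
    elif len(parts) == 3 and parts[2] == 'd':
        durable = True
    else:
        raise ValueError('unrecognized EC filename: %r' % filename)
    return {
        'timestamp': parts[0],
        'frag_index': int(parts[1]),
        'durable': durable,
    }
```

Under this assumption, a fragment previously marked by a separate .durable file is recognized from its own name, which is where the one-inode saving comes from.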

Upgrade Notes

  • Updated the PyECLib dependency to 1.3.1.

  • Note that after writing EC data with Swift 2.11.0 or later, that data will not be accessible to earlier versions of Swift.

Critical Issues

  • WARNING: If you are using the ISA-L library for erasure codes, please upgrade to liberasurecode 1.3.1 (or later) as soon as possible. If you are using isa_l_rs_vand with more than 4 parity fragments, please read https://bugs.launchpad.net/swift/+bug/1639691 and take necessary action.

Other Notes

  • Various other minor bug fixes and improvements.