Yoga Series Release Notes


New Features

  • Users can now upload Python code through the API (code_sources API) and create actions from it dynamically (using the dynamic_actions API). If needed, actions can also be modified and deleted. None of this requires a Mistral restart.

  • Added a new endpoint “/v2/code_sources/”, which is used to create, update, delete and get code sources from Mistral.

  • Added a new endpoint “/v2/dynamic_actions/”, which is used to create, update, delete and get dynamic actions from the Mistral runtime.

Upgrade Notes

  • The default value of the [oslo_policy] policy_file config option has been changed from policy.json to policy.yaml. Operators who use customized or previously generated static policy JSON files (which are not needed by default) should generate new policy files or convert the existing ones to YAML format. Use the oslopolicy-convert-json-to-yaml tool to convert a JSON policy file to a YAML policy file in a backward-compatible way.

Deprecation Notes

  • Use of JSON policy files was deprecated by the oslo.policy library during the Victoria development cycle. As a result, this deprecation is being noted in the Wallaby cycle with an anticipated future removal of support by oslo.policy. As such operators will need to convert to YAML policy files. Please see the upgrade notes for details on migration of any custom policy files.


New Features

  • Mistral action management has changed significantly. Mistral subsystems no longer access the database directly when they need to work with action definitions. Instead, they work with action providers registered in the new entry point “mistral.action.providers”. All action providers need to implement the base class ActionProvider declared in “mistral-lib” starting with version 2.3.0. Action providers are responsible for delivering so-called action descriptors that carry the most important information about particular actions, such as “name”, “description”, names of input parameters and so on. The entire system has now been refactored to use action providers.

    Using this new mechanism it’s now possible to deliver actions into the system dynamically without having to restart Mistral. We just need to come up with an action provider implementation that can do that and register it in the entry point from any Python project installed in the same Python environment. This approach also means that actions don’t have to be stored in the database anymore. How action descriptors are stored and looked up depends entirely on the particular action provider. It is possible to create action providers that fetch information about actions over HTTP, AMQP and potentially any other protocol.

    Additionally, switching to action providers made the engine code much cleaner and more encapsulated. For example, ad-hoc actions are no longer a concern of the Mistral engine. Instead of implementing all ad-hoc action logic in the engine, there’s now a special action provider fully responsible for ad-hoc actions. The detailed documentation on using action providers will be added soon.
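
As a rough illustration of the idea, the sketch below models an action provider that serves action descriptors from memory instead of the database. It deliberately does not import mistral-lib: `ActionDescriptor`, `InMemoryActionProvider`, `register`, `find` and `find_all` are hypothetical stand-ins, and the real ActionProvider base class may define different method names and signatures.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ActionDescriptor:
    """Hypothetical stand-in for the metadata a provider hands to the engine."""
    name: str
    description: str = ""
    input_names: Tuple[str, ...] = ()
    namespace: str = ""


class InMemoryActionProvider:
    """Hypothetical provider that keeps descriptors in a dict, not in the DB."""

    def __init__(self, name: str):
        self.name = name
        self._descriptors: Dict[Tuple[str, str], ActionDescriptor] = {}

    def register(self, descriptor: ActionDescriptor) -> None:
        # Descriptors can be added at runtime, without restarting anything.
        self._descriptors[(descriptor.namespace, descriptor.name)] = descriptor

    def find(self, action_name: str, namespace: str = "") -> Optional[ActionDescriptor]:
        # Lookup is entirely up to the provider; here it is a dict access.
        return self._descriptors.get((namespace, action_name))

    def find_all(self, namespace: str = "") -> List[ActionDescriptor]:
        return sorted(
            (d for (ns, _), d in self._descriptors.items() if ns == namespace),
            key=lambda d: d.name,
        )


provider = InMemoryActionProvider("demo")
provider.register(ActionDescriptor("demo.echo", "Returns its input", ("output",)))
print(provider.find("demo.echo").input_names)
```

A real provider would additionally have to be registered under the “mistral.action.providers” entry point of the package that ships it.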

Upgrade Notes

  • As part of the transition to action providers the compatibility of the /actions REST API endpoint has been broken to some extent. For ad-hoc actions it remained almost the same. We can still perform all CRUD operations on them. However, all the standard actions (prefixed with “std.”) are not stored in the DB anymore. For that reason actions of this type don’t have IDs anymore, and generally actions are no longer identified by IDs, only by name. This change needs to be taken into account when updating to this version of Mistral.


New Features

  • Mistral engine now supports graceful scale-in. That is, if the number of engines in a cluster needs to be reduced manually, it is now possible to do so without breaking currently running workflows. In order to shut down a Mistral engine, the SIGTERM signal needs to be sent to the corresponding process. On Unix operating systems it’s a matter of running the command “kill <engine PID>” in any shell. When this signal is caught by the process, it has a certain amount of time, configured by the ‘graceful_shutdown_timeout’ property, to complete currently running database transactions and process all buffered RPC messages that have already been polled from the queue. After this time elapses, the process will be forced to exit. By default, the value of the ‘graceful_shutdown_timeout’ property is 60 (seconds).

  • Move Mistral actions for OpenStack to mistral-extra library

  • Add support for creating ad-hoc actions in a namespace. Creating actions with the same name is now possible inside the same project. This feature is backward compatible.

    All existing actions are assumed to be in the default namespace, represented by an empty string. Also, if an action is created without a namespace specified, it is assumed to be in the default namespace.

    If an ad-hoc action is created inside a workbook, then the namespace of the workbook is also its namespace.

  • Added a new API to fetch sub-executions of an execution or a task.
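
The graceful scale-in behaviour described in the first item above can be pictured with a small sketch: the SIGTERM handler only raises a flag, and the worker then gets a bounded amount of time to finish what it has already buffered. This is not Mistral's engine code; `Worker`, `handle_sigterm` and `drain` are illustrative names, and only the 60 second default mirrors the documented ‘graceful_shutdown_timeout’ value.

```python
import queue
import signal
import threading
import time


class Worker:
    """Hypothetical worker that drains buffered messages on shutdown."""

    def __init__(self, graceful_shutdown_timeout: float = 60.0):
        self.timeout = graceful_shutdown_timeout
        self.buffered = queue.Queue()  # messages already polled from the broker
        self.processed = []
        self.stop_requested = threading.Event()

    def handle_sigterm(self, signum=None, frame=None):
        # Signal handlers should do as little as possible: just set a flag.
        self.stop_requested.set()

    def drain(self) -> bool:
        """Process buffered messages until done or until the timeout elapses.

        Returns True if everything was processed in time.
        """
        deadline = time.monotonic() + self.timeout
        while time.monotonic() < deadline:
            try:
                message = self.buffered.get_nowait()
            except queue.Empty:
                return True
            self.processed.append(message)
        return self.buffered.empty()


worker = Worker(graceful_shutdown_timeout=1.0)
for msg in ("run_task", "on_action_complete"):
    worker.buffered.put(msg)

# In a real process the handler would be installed with
# signal.signal(signal.SIGTERM, worker.handle_sigterm); here it is called directly.
worker.handle_sigterm(signal.SIGTERM, None)
finished_cleanly = worker.drain() if worker.stop_requested.is_set() else False
```

Once `drain` returns, or the timeout elapses, the process would exit; anything still unprocessed at that point is left for the remaining engines.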

Upgrade Notes

  • Python 2.7 support has been dropped. The last release of Mistral to support Python 2.7 is OpenStack Train. The minimum version of Python now supported by Mistral is Python 3.6.

Bug Fixes

  • Added the configuration option “convert_output_data” in the “yaql” group. Setting this option to False disables conversion of YAQL expression results. This fixes performance issues in the many use cases where the result of an expression is a big data structure: YAQL created a copy of this data structure every time before giving it to Mistral. This option can’t be set to False when the corresponding “convert_input_data” is True. Otherwise, it doesn’t work correctly. By default, the value of “convert_output_data” is True, which keeps backwards compatibility.

  • Fixed a token validation error that occurred when running a cron trigger. The problem was that a trust client could not validate a token when a cron trigger ran.

  • Some users rely on the presence of the root error related to running an action, and it was not convenient that it appeared at the end of the string, e.g. when looking at the “state_info” field of the corresponding task execution. Now the cause error message is included at the beginning of the resulting error string returned by the action executor so that it’s clearly visible. This message can also be truncated in some cases (depending on the config option), so we need to make sure we keep the cause error message.
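
The constraint between the two YAQL conversion options mentioned in the first item above can be expressed as a small check. The sketch below parses an INI fragment with the standard library rather than with oslo.config, and `check_yaql_options` is a hypothetical helper, not something Mistral ships.

```python
import configparser

SAMPLE_CONF = """
[yaql]
convert_input_data = False
convert_output_data = False
"""


def check_yaql_options(conf_text: str) -> dict:
    """Read the [yaql] group and reject the unsupported combination."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Both options default to True, which keeps the previous behaviour.
    convert_input = parser.getboolean("yaql", "convert_input_data", fallback=True)
    convert_output = parser.getboolean("yaql", "convert_output_data", fallback=True)
    if convert_input and not convert_output:
        raise ValueError(
            "convert_output_data can't be False while convert_input_data is True"
        )
    return {
        "convert_input_data": convert_input,
        "convert_output_data": convert_output,
    }


options = check_yaql_options(SAMPLE_CONF)
```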

New Features

  • The new configuration option “validation_mode” was added. It can take one of the values: “enabled”, “mandatory”, “disabled”. If it is set to “enabled” then Mistral will validate the workflow language syntax for all API operations that create or update workflows (either via /v2/workflows or /v2/workbooks) unless it’s explicitly disabled with the API parameter “skip_validation” that has now been added to the corresponding API endpoints. The “skip_validation” parameter doesn’t have to have any value since it’s a boolean flag. If the configuration option “validation_mode” is set to “mandatory” then Mistral will always validate the syntax of all workflows for the mentioned operations. If set to “disabled” then validation will always be skipped. Note that if validation is disabled (one way or another) then there’s a risk of breaking a workflow unexpectedly while it’s running, or of getting another unexpected error when uploading it, possibly without a user-friendly description of the error.
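
The interaction between “validation_mode” and “skip_validation” amounts to a small decision table. The function below is a sketch of that table only; `should_validate` is a hypothetical name and not part of Mistral's API.

```python
def should_validate(validation_mode: str, skip_validation: bool = False) -> bool:
    """Decide whether workflow syntax is validated for a create/update call."""
    if validation_mode == "mandatory":
        # The API flag is ignored: validation always happens.
        return True
    if validation_mode == "disabled":
        # Validation is always skipped, regardless of the API flag.
        return False
    if validation_mode == "enabled":
        # Validation happens unless the caller passed skip_validation.
        return not skip_validation
    raise ValueError(f"unknown validation_mode: {validation_mode!r}")


for mode in ("enabled", "mandatory", "disabled"):
    print(mode, should_validate(mode), should_validate(mode, skip_validation=True))
```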

New Features

  • The root_execution_id is now available on the Jinja execution object. Previously it was only possible to get it by filtering and querying the executions search.

  • Added HTTPProxyToWSGI middleware in front of the Mistral API. The purpose of this middleware is to set up the request URL correctly in the case there is a proxy (for instance, a loadbalancer such as HAProxy) in front of the Mistral API. The HTTPProxyToWSGI is off by default and needs to be enabled via a configuration value. Fixes [bug 1590608] Fixes [bug 1816364]

  • Adds the parameter private_key in the standard ssh actions. This allows a user to specify the key to use instead of using the ones available in the filesystem of the executors.

  • It’s now possible to add reply-to address when sending email.

Bug Fixes

  • Added the “convert_input_data” config property under the “yaql” group. By default it’s set to True, which preserves the current behavior so there’s no risk with compatibility. If set to False, it disables the additional data conversion that was initially added to support some tricky cases like working with sets of dicts (although dict is not a hashable type and can’t be put into a set). Disabling it gives a significant performance boost in cases when data contexts are very large.

  • There was a weird typo in the list generator expression made in that led to calculating a field value in the wrong way. Fixed. The added test was previously failing.

  • “__task_execution” wasn’t always included into the expression data context so the function task() didn’t work properly. Fixes [bug 1823875]

  • For an ad-hoc action, preparing input for its base action was done more than once. It happened during the validation phase and the scheduling phase. However, input preparation may be expensive in case of heavy expressions and data contexts. This has now been fixed by caching a prepared input within an AdHocAction instance.

  • WorkflowExecution database model had only “root_execution_id” to reference a root workflow execution, i.e. the most parent workflow execution in the execution tree. So if we needed to get an entity itself we’d always make a direct query to the database, in fact, w/o using an entity cache in the SQLAlchemy session. It’s now been fixed by adding a normal mapped entity for root workflow execution. In other words, WorkflowExecution class now has the property “root_execution”. It slightly improves performance in case this property is accessed more than once per the database session.

Bug Fixes

  • [bug 1715848] Fixed a bug that prevented event engines from working correctly in HA.

Upgrade Notes

  • Run mistral-db-manage --config-file <mistral-conf-file> upgrade head to ensure the database schema is up-to-date.

Bug Fixes

  • Mistral didn’t log enough info about sending actions to the executor and receiving them on the executor side. This made it hard to debug situations when an action got stuck in RUNNING state. It has now been fixed by adding additional log statements.

  • Fixed a backward compatibility issue: there was a change made in Rocky that disallowed the ‘params’ property of a workflow execution to be None when one wants to start a workflow.

  • Clean up transports along with RPC clients. Fixed a bad condition in the API server related to cron triggers and SIGHUP. The parent API server creates an RPC connection when creating workflows from cron triggers. If a SIGHUP signal arrives afterwards, the child inherits the connection, but it’s non-functional.

  • Sometimes Mistral was raising DetachedInstanceError for action definitions coming from cache. It’s now fixed by cloning objects before caching them.

  • [bug 1785654]

    Fixed a bug that prevents any action to run if the OpenStack catalog returned by Keystone is larger than 64kB if the backend is MySQL/MariaDB. The limit is now increased to 16MB.

  • Fixed an issue where the next link in some list APIs, when invoked with pagination and filter(s), contained a JSON string. This made the next link an invalid URL. This issue impacted all REST APIs where filters can be used.

  • Fixed the issue when “join” task remained in WAITING state forever if the last inbound task failed and it was not a direct predecessor.

  • If an action execution fails but returns a result as a list (error=[]), the result of this action is assigned to the task execution ‘state_info’ field, which is a string according to the DB model. On Python 3 this list is implicitly converted to a string. On Python 2.7 it isn’t. The reason is probably in how SQLAlchemy works on different versions of Python. This has now been fixed with an explicit type coercion.

  • Workflow output sometimes was not calculated correctly due to the race condition between different transactions: the one that checks workflow completion (i.e. calls “check_and_complete”) and the one that processes action execution completion (i.e. calls “on_action_complete”). Calculating output sometimes was based on stale data cached by the SQLAlchemy session. To fix this, we just need to expire all objects in the session so that they are refreshed automatically if we read their state in order to make required calculations. The corresponding change was made.

  • Workflow execution integrity checker mechanism was too aggressive in case of big workflows that have many task executions in RUNNING state at the same time. The mechanism was selecting them all in one query and calling “on_action_complete” for each of them within a single DB transaction. That could lead to situations when this mechanism would totally block all normal workflow processing whereas it should only be a “last chance” aid in case of real infrastructure failures (e.g. MQ outage). This issue has been fixed by adding a configurable batch size, so that the checker can’t select more than this number of task executions in RUNNING state at once.

  • Action heartbeat checker was using scheduler to process expired action executions periodically. The side effect was that upon system reboot there may have been duplicating delayed calls in the database. So over time, the number of such calls could be significant and those jobs could even affect performance. This has now been fixed with regular threads without using scheduler at all. Additionally, the new configuration property “batch_size” has been added under the group “action_heartbeat” to control the maximum number of action executions processed during one iteration of the action execution heartbeat checker.

  • Removed DB polling from the logic that checks the readiness of a “join” task. The polling led to situations where the CPU was mostly occupied by the scheduler running the corresponding periodic jobs, which prevented the workflow from moving forward at a proper speed. That happened when a workflow had lots of “join” tasks with many dependencies. It’s fixed now.

  • Eliminated an unnecessary update of the workflow execution object when processing the “on_action_complete” operation. Without this fix all such transactions had to compete for the workflow executions table, which caused lots of DB deadlocks (on MySQL) and transaction retries. In some cases the number of retries even exceeded the limit (currently hardcoded to 50), and such tasks could be fixed only by the integrity checker over time.

  • Action execution checker didn’t set a security context before failing expired action executions. It caused ApplicationContextNotFoundException in case if corresponding workflow specification was not in the cache and Mistral had to load a DB object. The DB operation in turn was trying to access a security context which wasn’t set. It’s now fixed by setting an admin context in the action execution checker thread.

  • Workflow and join completion check logic is now simplified with using post transactional queue of operations which is a more generic version of action_queue module previously serving for scheduling action runs outside of the main DB transaction. Workflow completion check is now registered only once when a task completes which reduces clutter and it’s registered only if the task may potentially lead to workflow completion.

  • The header X-Target-Insecure previously accepted any string and used it for comparisons. This meant unless it was empty (or not provided) it would always evaluate as True. This change makes the validation stricter, only accepting “True” and “False” and converting these to boolean values. Any other value will return an error.
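
The stricter handling of X-Target-Insecure described in the last item can be sketched as follows. `parse_target_insecure` is a hypothetical helper; treating a missing or empty header as False, and matching the two accepted strings case-sensitively, are assumptions made for the example.

```python
from typing import Optional


def parse_target_insecure(value: Optional[str]) -> bool:
    """Convert the header value to a boolean, rejecting anything unexpected."""
    if value is None or value == "":
        # Assumption: a missing or empty header means "not insecure".
        return False
    if value == "True":
        return True
    if value == "False":
        return False
    raise ValueError(
        f"X-Target-Insecure must be 'True' or 'False', got {value!r}"
    )


# The old behaviour boiled down to the truthiness of the raw string,
# so even the string "False" evaluated as True.
old_style = bool("False")
new_style = parse_target_insecure("False")
```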


New Features

  • Introduce execution events and notification server and plugins for publishing these events for consumers. Event notification is defined per workflow execution and can be configured to notify on all the events or only for specific events.

  • Add missing Tacker actions to Mistral, covering VNF forwarding graph (vnffg), vnffg descriptor, network service (ns) and ns descriptor actions:

    • vnffgd actions: create_vnffgd, delete_vnffgd, list_vnffgds, show_vnffgd

    • vnffg actions: create_vnffg, update_vnffg, delete_vnffg, list_vnffgs, show_vnffg

    • nsd actions: create_nsd, delete_nsd, list_nsds, show_nsd

    • ns actions: create_ns, delete_ns, list_nss, show_ns

  • Mistral now supports a publicize policy on actions and workflows which controls whether users are allowed to create or update them. The default policy does not change, which means that everyone can publish an action or workflow unless specified differently in the policy.

  • Added new JavaScript evaluator py_mini_racer. The py_mini_racer package allows us to get a JavaScript evaluator that doesn’t require compilation. This is much lighter and easier to get started with.

  • Enable caching of action definitions in local memory. Now, instead of loading the definitions from the database every time, the Mistral engine stores them in a local cache. This should reduce the number of database requests and improve the overall performance of the system. The cache TTL can be configured with the action_definition_cache_time option in the [engine] group. The default value is 60 seconds.

  • Added the config option “oslo_rpc_executor” that sets the executor type used by the Oslo Messaging framework. It defines how the Oslo Messaging based RPC subsystem processes incoming calls. Allowed values: “eventlet”, “threading” and “blocking”. However, “blocking” is deprecated by the Oslo Messaging team and may be removed in future versions. The reason for adding this option was the issues occurring when using the MySQLDb database driver together with the “eventlet” RPC executor. Once in a while, the system would hang on a deadlock caused by the fact that the DB driver wasn’t eventlet-friendly and dispatching of green threads didn’t work properly. That’s why “blocking” was used. It has now been proven that the combination of the “eventlet” executor and the PyMysql driver works well. The configuration option for the RPC executor, though, makes it possible to roll back to “blocking” if a regression is found, or to experiment with “threading”.

  • Added several config options that allow tweaking some aspects of the YAQL engine behavior.

  • Added the parameter force to forcefully delete executions. Note that using this parameter on unfinished executions might cause a cascade of errors.

  • Improves action with cc, bcc and html formatting.

  • Add Mistral actions for OpenStack Vitrage, the RCA service

  • Add support for creating workbooks in a namespace. Creating workbooks with the same name is now possible inside the same project. This feature is backward compatible.

    All existing workbooks are assumed to be in the default namespace, represented by an empty string. Also, if a workbook is created without a namespace specified, it is assumed to be in the default namespace.

    When a workbook is created, its namespace is inherited by the workflows contained within it. All operations on a particular workbook require combination of name and namespace to uniquely identify a workbook inside a project.

  • Added ‘safe-rerun’ policy to task-defaults section

  • Add Mistral actions for OpenStack Manila, the fileshare management service.

  • Add Mistral actions for OpenStack Qinling, the function management service.

  • Add Mistral actions for OpenStack Zun, the container service.
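
The action definition cache described in the list above is, in essence, a time-to-live cache in front of the database. The sketch below shows that technique with the standard library only; `TTLCache` and `load_from_db` are hypothetical names and this is not Mistral's actual implementation. The 60 second default mirrors the documented default of action_definition_cache_time.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database access
        value = loader(key)  # missing or expired: load and remember
        self._entries[key] = (now + self._ttl, value)
        return value


db_reads = []


def load_from_db(name: str) -> dict:
    # Stand-in for a database query; records that it was called.
    db_reads.append(name)
    return {"name": name}


fake_now = [0.0]  # injectable clock so the example needs no sleeping
cache = TTLCache(ttl_seconds=60.0, clock=lambda: fake_now[0])
cache.get("std.echo", load_from_db)  # miss: reads the "database"
cache.get("std.echo", load_from_db)  # hit: served from memory
fake_now[0] = 61.0
cache.get("std.echo", load_from_db)  # expired: reads again
```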

Known Issues

  • Deleting unfinished executions might cause a cascade of errors, so the standard behaviour has been changed to delete only executions that are safe to delete. A new parameter force was added to forcefully delete an execution regardless of the state it is in.

Bug Fixes

  • Added new indexes on the task_execution_id column of the action_executions_v2 and workflow_executions_v2 tables.

  • Fixed how Mistral initializes a child YAQL context before evaluating YAQL expressions. The given data context needs to go through a special filter that prepares the data properly, does conversion into internal types etc. Also, without this change YAQL engine options are not applied properly.

  • Fixed Jinja expression error handling where an invalid expression could prevent an action or task status from being correctly updated.

  • A regression was introduced that caused an error when logging a specific message. The string formatting was broken, which caused the logging to fail.

  • Fixed the logic of the ‘pause’ command. Before the fix Mistral wouldn’t run any commands specified in ‘on-success’, ‘on-error’ and ‘on-complete’ clauses following after the ‘pause’ command when a workflow was resumed after it. Now it works as expected. If Mistral encounters ‘pause’ in the list of commands it saves all commands following after it to the special backlog storage and when/if the workflow is later resumed it checks that storage and runs commands from it first.

  • A new config option section [keystone] is added. The options in the section come from keystoneauth by default. Please use them to configure the session used to talk to Keystone. If an option value is not set, to keep backward compatibility, Mistral will read the value from the same option in [keystone_authtoken].

    The override behavior will be removed in Stein. Please move the options into [keystone] if you still want to use them.

  • Mistral was storing some internal information in the task execution inbound context (the ‘task_executions_v2.in_context’ DB field). This information was needed only to correctly implement the YAQL function task() without arguments. A fix was made to not store this information in the persistent storage and rather include it into a context view right before evaluating expressions where needed. So it slightly reduces the space used in the DB.

  • Used “passive_deletes=True” in the configuration of relationships in SQLAlchemy models. This improves deletion of graphs of related objects stored in DB because dependent objects don’t get loaded prior to deletion which also reduces the memory requirement on the system. More about using this flag can be found at:

  • Evaluation of the final workflow context was very heavy in cases when the workflow had a lot of parallel tasks with large inbound contexts. Merging those contexts in order to evaluate the workflow output consumed a lot of memory. This algorithm has now been rewritten with batched DB queries and Python generators so that the GC has a chance to destroy objects that have already been processed. Previously all task executions had to stay in memory until the end of the processing. The result is that it now consumes 3 times less memory in heavy cases.

  • Mistral was storing, in fact, two copies of a workflow environment, one in workflow parameters (the ‘params’ field) and another one in a context (the ‘context’ field). Now it’s stored only in workflow parameters. It saves space in DB and increases performance in case of big workflow environments.

  • Mistral was copying a workflow environment into all of its sub-workflows. In case of a big workflow environment and a big number of sub-workflows it caused serious problems, used additional space in the DB and used a lot of RAM (e.g. when the ‘on-success’ clause has a lot of tasks where each one of them is a sub-workflow). Now it is fixed by evaluating a workflow environment through the root execution reference.
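
The memory fix for evaluating the final workflow context relies on a general technique: fetch rows in batches and stream them through a generator so that only one batch is referenced at a time. The sketch below illustrates that technique on an in-memory list; `iter_in_batches`, `merge_contexts` and `fetch_page` are hypothetical names, not Mistral functions.

```python
from typing import Callable, Dict, Iterable, Iterator, List

FetchPage = Callable[[int, int], List[dict]]


def iter_in_batches(fetch_page: FetchPage, batch_size: int) -> Iterator[dict]:
    """Yield rows one by one while querying them batch_size at a time."""
    offset = 0
    while True:
        page = fetch_page(offset, batch_size)
        if not page:
            return
        yield from page
        offset += len(page)
        # The previous page is no longer referenced here, so it can be freed.


def merge_contexts(contexts: Iterable[dict]) -> Dict[str, object]:
    """Merge contexts incrementally instead of collecting them all first."""
    merged: Dict[str, object] = {}
    for ctx in contexts:
        merged.update(ctx)
    return merged


# Stand-in for task executions stored in a database table.
rows = [{"a": 1}, {"b": 2}, {"a": 3}, {"c": 4}, {"d": 5}]
requested_offsets = []


def fetch_page(offset: int, limit: int) -> List[dict]:
    requested_offsets.append(offset)
    return rows[offset:offset + limit]


final_context = merge_contexts(iter_in_batches(fetch_page, batch_size=2))
```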


New Features

  • A new YAQL/jinja2 expression function has been added for outputting JSON. It is json_dump and accepts one argument, which is the object to be serialised to JSON.

  • Mistral now supports policy in code, which means that users who haven’t modified any policy rules can leave the policy file (in JSON or YAML format) empty or remove it altogether, because Mistral now keeps all default policies in the mistral/policies module. Users can still modify or generate a policy.yaml file; rules present in policy.yaml override the corresponding policy rules in code.

  • Support to manage a cron-trigger instance by id.

  • Add yaml_parse and json_parse expression functions. Each accepts a string, parses it as YAML or JSON respectively, and returns an object.
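
To make the contract of these expression functions concrete, here are plain-Python analogues of json_dump and json_parse built on the standard json module. They are stand-ins for illustration only: the real functions are invoked from YAQL/Jinja2 expressions, and details such as output indentation are not specified here. A yaml_parse analogue is omitted because it would need a third-party YAML library.

```python
import json
from typing import Any


def json_dump(obj: Any) -> str:
    """Analogue of json_dump: serialise one object to a JSON string."""
    return json.dumps(obj)


def json_parse(text: str) -> Any:
    """Analogue of json_parse: turn a JSON string back into an object."""
    return json.loads(text)


task_result = {"servers": ["web-1", "web-2"], "count": 2}
as_text = json_dump(task_result)
round_tripped = json_parse(as_text)
```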

Deprecation Notes

  • The YAQL/jinja2 expression function json_pp has been deprecated and will be removed in the S cycle. json_dump should be used instead.

Bug Fixes

  • Remove the ceilometerclient requirement. This library is not maintained and the Ceilometer API is dead, so this integration has been dropped.

Other Notes

  • The default policy.json file has been removed because Mistral now generates the default policies in code. Please be aware of this if you use that file in your environment.


New Features

  • Support to specify ‘action_region’ for OpenStack actions so that it’s possible to operate different resources in different regions in one single workflow.

  • Added ability to create public event triggers. Public event triggers are applied to all projects, i.e. workflows are triggered by event in any project. Currently public event triggers may be created only by admin but it can be changed in policy.json.

  • Creating and running workflows within a namespace. Workflows with the same name can be added to the same project as long as they are within a different namespace. This feature is backwards compatible.

    All existing workflows are assumed to be in the default namespace, represented by an empty string. Also, if a workflow is created without a namespace specified, it is assumed to be in the default namespace.

    When a workflow is being executed, the namespace is saved under params and passed to all its sub workflow executions. When looking for the next sub-workflow to run, the correct workflow will be found by name and namespace, where the namespace can be the workflow namespace or the default namespace. Workflows in the same namespace as the top workflow will be given a higher priority.

  • An external OpenStack action mapping file can be specified at or mistral-db-manage script. For more details see ‘ --help’ or ‘mistral-db-manage --help’.

  • From now on it is optional to list in the mapping file the OpenStack modules that you do not want to include in the supported action set.

  • New function, called tasks, available from within an expression (Yaql, Jinja2). This function allows to filter all tasks of a user by workflow execution id and/or state. In addition it is possible to get tasks recursively and flatten the tasks list.

  • A new parameter called ‘include_output’ was added to the action execution API. By default, the output field is not returned when calling the list action executions API.

  • By default, an admin user can get/list/update/delete other projects’ resources. In Pike, only workflows and executions are supported.

  • Mistral action developer can get the start time of a workflow execution by using <% execution().created_at %>.

  • The Mistral docker image and tooling has been updated to significantly ease the starting of a Mistral cluster. The setup now supports all-in-one and multi-container deployments. Also, the scripts were cleaned up and aligned with the Docker best practice.
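
The namespace rules for workflows described earlier in this list (look in the namespace of the top workflow first, then fall back to the default namespace) can be sketched as a lookup over (name, namespace) keys. `find_workflow` and the `definitions` mapping are hypothetical and only illustrate the resolution order.

```python
from typing import Dict, Optional, Tuple

DEFAULT_NAMESPACE = ""  # the default namespace is an empty string


def find_workflow(definitions: Dict[Tuple[str, str], str],
                  name: str,
                  namespace: str = DEFAULT_NAMESPACE) -> Optional[str]:
    """Resolve a workflow by name, preferring the given namespace."""
    for candidate in (namespace, DEFAULT_NAMESPACE):
        found = definitions.get((name, candidate))
        if found is not None:
            return found
    return None


# Two workflows share a name inside one project but live in different namespaces.
definitions = {
    ("deploy", DEFAULT_NAMESPACE): "deploy (default namespace)",
    ("deploy", "team_a"): "deploy (team_a)",
    ("cleanup", DEFAULT_NAMESPACE): "cleanup (default namespace)",
}

same_namespace = find_workflow(definitions, "deploy", "team_a")
fallback = find_workflow(definitions, "cleanup", "team_a")
```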

Upgrade Notes

  • Run python tools/ --config-file <mistral-conf-file> to re-populate database.

Deprecation Notes

  • The config option ‘os-actions-endpoint-type’ is moved from DEFAULT group to ‘openstack_actions’ group.

Critical Issues

  • By default, the output field is no longer returned when listing action executions. In previous versions it was, so users who relied on it, or who want the output field in the list action executions API response, can get it only by using the new ‘include_output’ parameter.

Bug Fixes

  • When we pass a workflow environment to workflow parameters using ‘env’, Mistral first evaluates it, assuming that it can contain expressions (YAQL/Jinja). For example, one environment variable can be expressed through another. In some cases this causes problems, for example if the environment is too big and has many expressions, especially something like <% $ %> or <% env() %>. Also, in some cases we don’t want any evaluation to happen because we want to keep some informative text containing expressions in the environment. In order to address that, the ‘evaluate_env’ workflow parameter was added, defaulting to True for backwards compatibility. If it’s set to False then it disables evaluation of expressions in the environment.

  • Added support for referencing task and workflow context data, including environment variables via env(), when using YAQL/Jinja2 expressions inside AdHoc Actions. YAQL/Jinja2 expressions can reference env() and other context data in the base-input section.

  • Javascript support in docker image.

New Features

  • Aodh actions are now supported.

  • Gnocchi actions are now supported.

New Features

  • It is now possible to use the Bare metal (Ironic) API features introduced in API version 1.10 to 1.22.

Upgrade Notes

  • Required Ironic API version was bumped to ‘1.22’ (corresponding to Ironic 6.2.0 - Newton final release).

  • Due to the default Ironic API version change to ‘1.22’, new bare metal nodes created with the ‘node_create’ action appear in the “enroll” provision state instead of “available”. Please update your workflows accordingly.

Critical Issues

  • Mistral does not consider the initial task run as a retry but only considers the retry value after the failure of initial task execution.
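
Read literally, this means that a retry count of N allows up to N + 1 executions of a task: one initial run plus N retries. The sketch below illustrates that counting; `run_with_retry` is a hypothetical helper and not Mistral's retry policy implementation.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def run_with_retry(action: Callable[[], T], retry_count: int) -> Tuple[T, int]:
    """Run action; after a failure, retry up to retry_count more times.

    Returns the result and the total number of executions.
    """
    executions = 0
    last_error = None
    # One initial run plus retry_count retries.
    for _ in range(1 + retry_count):
        executions += 1
        try:
            return action(), executions
        except Exception as exc:  # a failed run uses up one retry
            last_error = exc
    raise RuntimeError(f"failed after {executions} executions") from last_error


calls = []


def flaky() -> str:
    # Fails on the first two executions, succeeds on the third.
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("not yet")
    return "ok"


result, executions = run_with_retry(flaky, retry_count=3)
```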

New Features

  • Senlin actions are now supported.

Bug Fixes

  • [bug 1633345]

    Users can now define the target region for OpenStack actions. This can be done via the API using the X-Region-Name header, and X-Target-Region-Name when the multi-vim feature is used.

New Features

  • Mistral now supports an alternative RPC layer that calls RabbitMQ directly instead of using Oslo.

  • Tasks support new flag ‘safe-rerun’. If it is set to ‘true’, a task would be re-run if executor dies during execution.

  • Mistral API server can be configured to handle https requests.

New Features

  • Mistral now supports authentication with KeyCloak server using OpenId Connect protocol.

  • Magnum actions are now supported.

  • Role-based access control was added.

  • Murano actions are now supported.

  • Tacker actions are now supported.

Upgrade Notes

  • During an upgrade to Newton, operators or administrators need to run python tools/ to populate database with Magnum action definitions.

Bug Fixes

  • Fix for YaqlEvaluationException in std.create_instance workflow.

New Features

  • Users can now provide a custom message for a fail/pause/success transition, e.g. - fail(msg=’error in task’): <% condition if any %>

  • New API for validating ad-hoc actions was added.


Tempest plugin has been implemented. Now Mistral tests can be run from Mistral repo as well as from Tempest repo.

Actions of several OpenStack services are supported out of the box in Mitaka, including Barbican, Cinder(V2), Swift, Trove, Zaqar and Mistral.

New Features

  • Add support for the workflow sharing feature. Users of one project can share workflows with other projects using this feature.

Upgrade Notes

  • During an upgrade to Mitaka, operators or administrators need to run python tools/ <service name> command to generate service action names and values for updating mistral/actions/openstack/mapping.json, then, run python tools/ to populate database. Please note, some services like Neutron, Swift, Zaqar don’t support the command yet.

Deprecation Notes

  • Usage of the workflow name in the system (e.g. creating executions/cron-triggers, workflow CRUD operations, etc.) is deprecated, please use the workflow UUID instead. The workflow sharing feature can only be used with the workflow UUID.


A pre-installed Mistral docker image is now available to get a quick idea of Mistral.

Security Issues

  • [bug 1521802] Fixing the problem that sometimes sub-workflow executions were run/saved under the wrong tenant in cron trigger periodic task in multi-tenancy deployment.

Bug Fixes

  • [bug 1518012] [bug 1513456]

    Fix concurrency issues by using READ_COMMITTED

    This release note describes the problem behind the following bugs and the fix:
    • #1513456 - task stuck in RUNNING state when all action executions are finished.

    • #1518012 - WF execution stays in RUNNING although task and action executions are in SUCCESS.

    This fix does not require any action from Mistral users and does not have any implications other than the bug fix.

    The state of a workflow execution was not updated even when all task executions were completed if some tasks finished at the same time as other tasks.

    We were using our connections with transaction isolation level REPEATABLE_READ, so each process was using a snapshot of the DB created at the first read statement in that transaction. When a task finished and evaluated the state of all the other tasks, it did not see the up-to-date state of those tasks. Because it appeared that not all tasks were completed, the task did not change the workflow execution state.

    Similar behavior happened with multiple action executions under the same task. On completion, each action execution checked the status of the other action executions and did not see the up-to-date state of these action executions, causing the task execution to stay in RUNNING state.

    The solution is to change DB transaction isolation level from REPEATABLE_READ to READ_COMMITTED so process A can see changes committed in other transactions even if process A is in the middle of a transaction.

    A short explanation regarding the different isolation levels:

    • REPEATABLE_READ - while in a transaction, the first read operation to the DB creates a snapshot of the entire DB so you are guaranteed that all the data in the DB will remain the same until the end of the transaction.

      REPEATABLE_READ example:
      • ConnectionA selects from tableA in a transaction.

      • ConnectionB deletes all rows from tableB in a transaction.

      • ConnectionB commits.

      • ConnectionA loops over the rows of tableA and fetches from tableB using the tableA_tableB_FK - ConnectionA will get rows from tableB.

    • READ_COMMITTED - while in a transaction, every query to the DB will get the committed data.

      READ_COMMITTED example:
      • ConnectionA starts a transaction.

      • ConnectionB starts a transaction.

      • ConnectionA inserts a row into tableA and commits.

      • ConnectionB inserts a row into tableA.

      • ConnectionB selects tableA and gets two rows.

      • ConnectionB commits / rollback.
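
The difference between the two isolation levels, and why it kept executions in RUNNING state, can be reproduced with a toy in-memory model. This is not a real database: `ToyDatabase` and `ToyTransaction` are hypothetical classes that only mimic "snapshot at first read" versus "always read committed data".

```python
import copy


class ToyDatabase:
    """Holds committed state only; a stand-in for real tables."""

    def __init__(self):
        self.committed = {}


class ToyTransaction:
    """Reads according to a simplified model of the isolation level."""

    def __init__(self, db: ToyDatabase, isolation: str):
        if isolation not in ("REPEATABLE_READ", "READ_COMMITTED"):
            raise ValueError(f"unsupported isolation level: {isolation}")
        self.db = db
        self.isolation = isolation
        self._snapshot = None

    def read(self, key: str):
        if self.isolation == "READ_COMMITTED":
            # Every read sees the latest committed data.
            return self.db.committed.get(key)
        # REPEATABLE_READ: the first read freezes a snapshot of the DB.
        if self._snapshot is None:
            self._snapshot = copy.deepcopy(self.db.committed)
        return self._snapshot.get(key)


def observed_state_of_t2(isolation: str) -> str:
    db = ToyDatabase()
    db.committed.update({"t1": "RUNNING", "t2": "RUNNING"})
    tx_a = ToyTransaction(db, isolation)  # process A handles completion of t1
    tx_a.read("t1")                       # first read in transaction A
    db.committed["t2"] = "SUCCESS"        # process B commits completion of t2
    return tx_a.read("t2")                # A now checks the other task


stale = observed_state_of_t2("REPEATABLE_READ")
fresh = observed_state_of_t2("READ_COMMITTED")
```

Under the REPEATABLE_READ model process A still sees t2 as RUNNING and therefore leaves the workflow execution untouched; under READ_COMMITTED it sees SUCCESS and can complete the workflow.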

    Two good articles about isolation levels are: