Juno

 Chapter 5. Data processing service

The Data processing service (sahara) provides a scalable data-processing stack and associated management interfaces.

The following tables provide a comprehensive list of the Data processing service configuration options.

Table 5.1. Description of AMQP configuration options
Configuration option = Default value Description
[DEFAULT]
amqp_auto_delete = False (BoolOpt) Auto-delete queues in amqp.
amqp_durable_queues = False (BoolOpt) Use durable queues in amqp.
control_exchange = openstack (StrOpt) The default exchange under which topics are scoped. May be overridden by an exchange name specified in the transport_url option.
notification_driver = [] (MultiStrOpt) Driver or drivers to handle sending notifications.
notification_level = INFO (StrOpt) Notification level for outgoing notifications
notification_publisher_id = None (StrOpt) Notification publisher_id for outgoing notifications
notification_topics = notifications (ListOpt) AMQP topic used for OpenStack notifications.
transport_url = None (StrOpt) A URL representing the messaging driver to use and its full configuration. If not set, we fall back to the rpc_backend option and driver specific configuration.
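
As an illustration of how these options might be combined, the following sahara.conf fragment is a minimal sketch. The broker host name (controller), the user name and the password placeholder are assumptions for the example and are not defaults. The notification_driver value shown (messaging) is one of the oslo.messaging notification drivers.

```ini
[DEFAULT]
# Hypothetical broker address and credentials; replace with your own.
transport_url = rabbit://sahara:SAHARA_MQ_PASS@controller:5672/
control_exchange = openstack
# Send notifications over the message bus on the default topic.
notification_driver = messaging
notification_topics = notifications
```

If transport_url is left unset, the rpc_backend option and the driver-specific options described later in this chapter are used instead.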

Table 5.2. Description of authorization token configuration options
Configuration option = Default value Description
[keystone_authtoken]
admin_password = None (StrOpt) Keystone account password
admin_tenant_name = admin (StrOpt) Keystone service account tenant name to validate user tokens
admin_token = None (StrOpt) This option is deprecated and may be removed in a future release. Single shared secret with the Keystone configuration used for bootstrapping a Keystone installation, or otherwise bypassing the normal authentication process. This option should not be used, use `admin_user` and `admin_password` instead.
admin_user = None (StrOpt) Keystone account username
auth_admin_prefix = (StrOpt) Prefix to prepend at the beginning of the path. Deprecated, use identity_uri.
auth_host = 127.0.0.1 (StrOpt) Host providing the admin Identity API endpoint. Deprecated, use identity_uri.
auth_port = 35357 (IntOpt) Port of the admin Identity API endpoint. Deprecated, use identity_uri.
auth_protocol = https (StrOpt) Protocol of the admin Identity API endpoint (http or https). Deprecated, use identity_uri.
auth_uri = None (StrOpt) Complete public Identity API endpoint
auth_version = None (StrOpt) API version of the admin Identity API endpoint
cache = None (StrOpt) Env key for the swift cache
cafile = None (StrOpt) A PEM encoded Certificate Authority to use when verifying HTTPS connections. Defaults to system CAs.
certfile = None (StrOpt) Required if Keystone server requires client certificate
check_revocations_for_cached = False (BoolOpt) If true, the revocation list will be checked for cached tokens. This requires that PKI tokens are configured on the Keystone server.
delay_auth_decision = False (BoolOpt) Do not handle authorization requests within the middleware, but delegate the authorization decision to downstream WSGI components
enforce_token_bind = permissive (StrOpt) Used to control the use and type of token binding. Can be set to: "disabled" to not check token binding. "permissive" (default) to validate binding information if the bind type is of a form known to the server and ignore it if not. "strict" like "permissive" but if the bind type is unknown the token will be rejected. "required" any form of token binding is needed to be allowed. Finally the name of a binding method that must be present in tokens.
hash_algorithms = md5 (ListOpt) Hash algorithms to use for hashing PKI tokens. This may be a single algorithm or multiple. The algorithms are those supported by Python standard hashlib.new(). The hashes will be tried in the order given, so put the preferred one first for performance. The result of the first hash will be stored in the cache. This will typically be set to multiple values only while migrating from a less secure algorithm to a more secure one. Once all the old tokens are expired this option should be set to a single value for better performance.
http_connect_timeout = None (BoolOpt) Request timeout value for communicating with Identity API server.
http_request_max_retries = 3 (IntOpt) Number of times to retry the connection when communicating with the Identity API server.
identity_uri = None (StrOpt) Complete admin Identity API endpoint. This should specify the unversioned root endpoint e.g. https://localhost:35357/
include_service_catalog = True (BoolOpt) (optional) indicate whether to set the X-Service-Catalog header. If False, middleware will not ask for service catalog on token validation and will not set the X-Service-Catalog header.
insecure = False (BoolOpt) If set to True, HTTPS connections are not verified. The default (False) verifies HTTPS connections.
keyfile = None (StrOpt) Required if Keystone server requires client certificate
memcache_secret_key = None (StrOpt) (optional, mandatory if memcache_security_strategy is defined) this string is used for key derivation.
memcache_security_strategy = None (StrOpt) (optional) if defined, indicate whether token data should be authenticated or authenticated and encrypted. Acceptable values are MAC or ENCRYPT. If MAC, token data is authenticated (with HMAC) in the cache. If ENCRYPT, token data is encrypted and authenticated in the cache. If the value is not one of these options or empty, auth_token will raise an exception on initialization.
revocation_cache_time = 10 (IntOpt) Determines the frequency at which the list of revoked tokens is retrieved from the Identity service (in seconds). A high number of revocation events combined with a low cache duration may significantly reduce performance.
signing_dir = None (StrOpt) Directory used to cache files related to PKI tokens
token_cache_time = 300 (IntOpt) In order to prevent excessive effort spent validating tokens, the middleware caches previously-seen tokens for a configurable duration (in seconds). Set to -1 to disable caching completely.
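
A [keystone_authtoken] section that follows the non-deprecated options above might look like the sketch below. The host name (controller), the service tenant name, the sahara user name and the password placeholder are assumptions for the example; only the option names come from the table.

```ini
[keystone_authtoken]
# Public Identity endpoint.
auth_uri = http://controller:5000/v2.0
# Unversioned admin Identity endpoint; replaces auth_host, auth_port and auth_protocol.
identity_uri = http://controller:35357
# Hypothetical service account used to validate user tokens.
admin_tenant_name = service
admin_user = sahara
admin_password = SAHARA_PASS
```

Using identity_uri with admin_user and admin_password avoids both the deprecated auth_* options and the deprecated admin_token option.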

Table 5.3. Description of common configuration options
Configuration option = Default value Description
[DEFAULT]
cluster_remote_threshold = 70 (IntOpt) The same as global_remote_threshold, but for a single cluster.
compute_topology_file = etc/sahara/compute.topology (StrOpt) File with nova compute topology. It should contain mapping between nova computes and racks. File format (one mapping per line): "compute1 /rack1", "compute2 /rack2", "compute3 /rack2".
detach_volume_timeout = 300 (IntOpt) Timeout for detaching volumes from instance (in seconds).
enable_data_locality = False (BoolOpt) Enables data locality for the Hadoop cluster. Also enables data locality for Swift used by Hadoop. If enabled, the 'compute_topology_file' and 'swift_topology_file' configuration parameters should point to the OpenStack and Swift topology files, respectively.
enable_hypervisor_awareness = True (BoolOpt) Enables four-level topology for data locality. Works only if corresponding plugin supports such mode.
enable_notifications = False (BoolOpt) Enables sending notifications to Ceilometer
global_remote_threshold = 100 (IntOpt) Maximum number of remote operations that will be running at the same time. Note that each remote operation requires its own process to run.
host = (StrOpt) Hostname or IP address that will be used to listen on.
infrastructure_engine = direct (StrOpt) An engine which will be used to provision infrastructure for Hadoop cluster.
job_binary_max_KB = 5120 (IntOpt) Maximum length of job binary data in kilobytes that may be stored or retrieved in a single operation.
job_workflow_postfix = (StrOpt) Postfix for storing jobs in hdfs. Will be added to '/user/<hdfs user>/' path.
lock_path = None (StrOpt) Directory to use for lock files.
memcached_servers = None (ListOpt) Memcached servers or None for in process cache.
min_transient_cluster_active_time = 30 (IntOpt) Minimal "lifetime" in seconds for a transient cluster. Cluster is guaranteed to be "alive" within this time period.
node_domain = novalocal (StrOpt) The suffix of the node's FQDN. In nova-network that is the dhcp_domain config parameter.
os_region_name = None (StrOpt) Region name used to get services endpoints.
periodic_enable = True (BoolOpt) Enable periodic tasks.
periodic_fuzzy_delay = 60 (IntOpt) Range in seconds to randomly delay when starting the periodic task scheduler to reduce stampeding. (Disable by setting to 0).
periodic_interval_max = 60 (IntOpt) Max interval size between periodic tasks execution in seconds.
plugins = vanilla, hdp, spark (ListOpt) List of plugins to be loaded. Sahara preserves the order of the list when returning it.
port = 8386 (IntOpt) Port that will be used to listen on.
remote = ssh (StrOpt) A method for Sahara to execute commands on VMs.
run_external_periodic_tasks = True (BoolOpt) Some periodic tasks can be run in a separate process. Should we run them here?
swift_topology_file = etc/sahara/swift.topology (StrOpt) File with Swift topology. It should contain mapping between Swift nodes and racks. File format (one mapping per line): "node1 /rack1", "node2 /rack2", "node3 /rack2".
use_floating_ips = True (BoolOpt) If set to True, Sahara will use floating IPs to communicate with instances. To make sure that all instances have floating IPs assigned in Nova Network set "auto_assign_floating_ip=True" in nova.conf. If Neutron is used for networking, make sure that all Node Groups have "floating_ip_pool" parameter defined.
use_identity_api_v3 = True (BoolOpt) Enables Sahara to use Keystone API v3. If that flag is disabled, per-job clusters will not be terminated automatically.
use_namespaces = False (BoolOpt) Use network namespaces for communication (only valid to use in conjunction with use_neutron=True).
use_neutron = False (BoolOpt) Use Neutron Networking (False indicates the use of Nova networking).
[conductor]
use_local = True (BoolOpt) Perform sahara-conductor operations locally.
[keystone_authtoken]
memcached_servers = None (ListOpt) Optionally specify a list of memcached server(s) to use for caching. If left undefined, tokens will instead be cached in-process.
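
The networking and data locality options above depend on one another, so a combined example may help. The following fragment is a sketch for a deployment that uses Neutron with network namespaces and enables data locality. The absolute topology file paths are assumptions for the example.

```ini
[DEFAULT]
# Neutron networking; use_namespaces is only valid together with use_neutron.
use_neutron = True
use_namespaces = True
# With Neutron, every node group must define floating_ip_pool.
use_floating_ips = True

# Data locality needs both topology files to be present.
enable_data_locality = True
# Hypothetical locations. Each file holds one "name /rack" pair per line.
compute_topology_file = /etc/sahara/compute.topology
swift_topology_file = /etc/sahara/swift.topology
```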

Table 5.4. Description of database configuration options
Configuration option = Default value Description
[DEFAULT]
db_driver = sahara.db (StrOpt) Driver to use for database access.
[database]
backend = sqlalchemy (StrOpt) The back end to use for the database.
connection = None (StrOpt) The SQLAlchemy connection string to use to connect to the database.
connection_debug = 0 (IntOpt) Verbosity of SQL debugging information: 0=None, 100=Everything.
connection_trace = False (BoolOpt) Add Python stack traces to SQL as comment strings.
db_inc_retry_interval = True (BoolOpt) If True, increases the interval between database connection retries up to db_max_retry_interval.
db_max_retries = 20 (IntOpt) Maximum database connection retries before error is raised. Set to -1 to specify an infinite retry count.
db_max_retry_interval = 10 (IntOpt) If db_inc_retry_interval is set, the maximum seconds between database connection retries.
db_retry_interval = 1 (IntOpt) Seconds between database connection retries.
idle_timeout = 3600 (IntOpt) Timeout before idle SQL connections are reaped.
max_overflow = None (IntOpt) If set, use this value for max_overflow with SQLAlchemy.
max_pool_size = None (IntOpt) Maximum number of SQL connections to keep open in a pool.
max_retries = 10 (IntOpt) Maximum db connection retries during startup. Set to -1 to specify an infinite retry count.
min_pool_size = 1 (IntOpt) Minimum number of SQL connections to keep open in a pool.
mysql_sql_mode = TRADITIONAL (StrOpt) The SQL mode to be used for MySQL sessions. This option, including the default, overrides any server-set SQL mode. To use whatever SQL mode is set by the server configuration, set this to no value. Example: mysql_sql_mode=
pool_timeout = None (IntOpt) If set, use this value for pool_timeout with SQLAlchemy.
retry_interval = 10 (IntOpt) Interval between retries of opening a SQL connection.
slave_connection = None (StrOpt) The SQLAlchemy connection string to use to connect to the slave database.
sqlite_db = oslo.sqlite (StrOpt) The file name to use with SQLite.
sqlite_synchronous = True (BoolOpt) If True, SQLite uses synchronous mode.
use_db_reconnect = False (BoolOpt) Enable the experimental use of database reconnect on connection lost.
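
For example, a [database] section for a MySQL back end could be sketched as follows. The host name (controller), the database name, the user name and the password placeholder are assumptions for the example. The retry and timeout values repeat the defaults from the table to show where they belong.

```ini
[database]
# SQLAlchemy connection string; credentials and host are hypothetical.
connection = mysql://sahara:SAHARA_DBPASS@controller/sahara
# Retry the initial connection 10 times, 10 seconds apart.
max_retries = 10
retry_interval = 10
# Reap SQL connections that stay idle longer than one hour.
idle_timeout = 3600
```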

Table 5.5. Description of domain configuration options
Configuration option = Default value Description
[DEFAULT]
proxy_user_domain_name = None (StrOpt) The domain Sahara will use to create new proxy users for Swift object access.
proxy_user_role_names = Member (ListOpt) A list of the role names that the proxy user should assume through trust for Swift object access.
use_domain_for_proxy_users = False (BoolOpt) Enables Sahara to use a domain for creating temporary proxy users to access Swift. If this is enabled a domain must be created for Sahara to use.
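
The three options work together. A sketch of a configuration that enables proxy users is shown below. The domain name sahara_proxy is an assumption for the example, and the domain must already exist in the Identity service before Sahara can use it.

```ini
[DEFAULT]
use_domain_for_proxy_users = True
# Hypothetical domain created beforehand for Sahara's temporary proxy users.
proxy_user_domain_name = sahara_proxy
# Roles the proxy user assumes through a trust for Swift object access.
proxy_user_role_names = Member
```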

Table 5.6. Description of process locking configuration options
Configuration option = Default value Description
[DEFAULT]
disable_process_locking = False (BoolOpt) Enables or disables inter-process locks.

Table 5.7. Description of logging configuration options
Configuration option = Default value Description
[DEFAULT]
debug = False (BoolOpt) Print debugging output (set logging level to DEBUG instead of default WARNING level).
default_log_levels = amqplib=WARN, qpid.messaging=INFO, stevedore=INFO, eventlet.wsgi.server=WARN, sqlalchemy=WARN, boto=WARN, suds=INFO, keystone=INFO, paramiko=WARN, requests=WARN, iso8601=WARN (ListOpt) List of logger=LEVEL pairs.
fatal_deprecations = False (BoolOpt) Enables or disables fatal status of deprecations.
instance_format = "[instance: %(uuid)s] " (StrOpt) The format for an instance that is passed with the log message.
instance_uuid_format = "[instance: %(uuid)s] " (StrOpt) The format for an instance UUID that is passed with the log message.
log_config_append = None (StrOpt) The name of a logging configuration file. This file is appended to any existing logging configuration files. For details about logging configuration files, see the Python logging module documentation.
log_date_format = %Y-%m-%d %H:%M:%S (StrOpt) Format string for %(asctime)s in log records.
log_dir = None (StrOpt) (Optional) The base directory used for relative --log-file paths.
log_exchange = False (BoolOpt) Log request/response exchange details: environ, headers and bodies.
log_file = None (StrOpt) (Optional) Name of log file to output to. If no default is set, logging will go to stdout.
log_format = None (StrOpt) DEPRECATED. A logging.Formatter log message format string which may use any of the available logging.LogRecord attributes. This option is deprecated. Please use logging_context_format_string and logging_default_format_string instead.
logging_context_format_string = %(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [%(request_id)s %(user_identity)s] %(instance)s%(message)s (StrOpt) Format string to use for log messages with context.
logging_debug_format_suffix = %(funcName)s %(pathname)s:%(lineno)d (StrOpt) Data to append to log format when level is DEBUG.
logging_default_format_string = %(asctime)s.%(msecs)03d %(process)d %(levelname)s %(name)s [-] %(instance)s%(message)s (StrOpt) Format string to use for log messages without context.
logging_exception_prefix = %(asctime)s.%(msecs)03d %(process)d TRACE %(name)s %(instance)s (StrOpt) Prefix each line of exception output with this format.
publish_errors = False (BoolOpt) Enables or disables publication of error events.
syslog_log_facility = LOG_USER (StrOpt) Syslog facility to receive log lines.
use_stderr = True (BoolOpt) Log output to standard error.
use_syslog = False (BoolOpt) Use syslog for logging. Existing syslog format is DEPRECATED during I (Icehouse), and will change in J (Juno) to honor RFC5424.
use_syslog_rfc_format = False (BoolOpt) (Optional) Enables or disables syslog rfc5424 format for logging. If enabled, prefixes the MSG part of the syslog message with APP-NAME (RFC5424). The format without the APP-NAME is deprecated in I (Icehouse), and will be removed in J (Juno).
verbose = False (BoolOpt) Print more verbose output (set logging level to INFO instead of default WARNING level).
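
A typical file-based logging setup might be sketched as follows. The log directory and file name are assumptions for the example.

```ini
[DEFAULT]
# INFO-level logging; set debug = True instead for DEBUG-level output.
verbose = True
debug = False
# Hypothetical log location; log_file is resolved relative to log_dir.
log_dir = /var/log/sahara
log_file = sahara.log
# Do not duplicate log output on standard error.
use_stderr = False
```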

Table 5.8. Description of Qpid configuration options
Configuration option = Default value Description
[DEFAULT]
qpid_heartbeat = 60 (IntOpt) Seconds between connection keepalive heartbeats.
qpid_hostname = localhost (StrOpt) Qpid broker hostname.
qpid_hosts = $qpid_hostname:$qpid_port (ListOpt) Qpid HA cluster host:port pairs.
qpid_password = (StrOpt) Password for Qpid connection.
qpid_port = 5672 (IntOpt) Qpid broker port.
qpid_protocol = tcp (StrOpt) Transport to use, either 'tcp' or 'ssl'.
qpid_receiver_capacity = 1 (IntOpt) The number of prefetched messages held by receiver.
qpid_sasl_mechanisms = (StrOpt) Space separated list of SASL mechanisms to use for auth.
qpid_tcp_nodelay = True (BoolOpt) Whether to disable the Nagle algorithm.
qpid_topology_version = 1 (IntOpt) The qpid topology version to use. Version 1 is what was originally used by impl_qpid. Version 2 includes some backwards-incompatible changes that allow broker federation to work. Users should update to version 2 when they are able to take everything down, as it requires a clean break.
qpid_username = (StrOpt) Username for Qpid connection.

Table 5.9. Description of RabbitMQ configuration options
Configuration option = Default value Description
[DEFAULT]
kombu_reconnect_delay = 1.0 (FloatOpt) How long to wait before reconnecting in response to an AMQP consumer cancel notification.
kombu_ssl_ca_certs = (StrOpt) SSL certification authority file (valid only if SSL enabled).
kombu_ssl_certfile = (StrOpt) SSL cert file (valid only if SSL enabled).
kombu_ssl_keyfile = (StrOpt) SSL key file (valid only if SSL enabled).
kombu_ssl_version = (StrOpt) SSL version to use (valid only if SSL enabled). Valid values are TLSv1, SSLv23 and SSLv3. SSLv2 may be available on some distributions.
rabbit_ha_queues = False (BoolOpt) Use HA queues in RabbitMQ (x-ha-policy: all). If you change this option, you must wipe the RabbitMQ database.
rabbit_host = localhost (StrOpt) The RabbitMQ broker address where a single node is used.
rabbit_hosts = $rabbit_host:$rabbit_port (ListOpt) RabbitMQ HA cluster host:port pairs.
rabbit_login_method = AMQPLAIN (StrOpt) The RabbitMQ login method.
rabbit_max_retries = 0 (IntOpt) Maximum number of RabbitMQ connection retries. Default is 0 (infinite retry count).
rabbit_password = guest (StrOpt) The RabbitMQ password.
rabbit_port = 5672 (IntOpt) The RabbitMQ broker port where a single node is used.
rabbit_retry_backoff = 2 (IntOpt) How long to back off between retries when connecting to RabbitMQ.
rabbit_retry_interval = 1 (IntOpt) How frequently to retry connecting with RabbitMQ.
rabbit_use_ssl = False (BoolOpt) Connect over SSL for RabbitMQ.
rabbit_userid = guest (StrOpt) The RabbitMQ userid.
rabbit_virtual_host = / (StrOpt) The RabbitMQ virtual host.
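
As a sketch of how these options could be set for a two-node RabbitMQ cluster, consider the fragment below. The host names (rabbit1, rabbit2), the user name and the password placeholder are assumptions for the example. The rpc_backend option is described in the RPC table later in this chapter.

```ini
[DEFAULT]
rpc_backend = rabbit
# Hypothetical HA cluster members as host:port pairs.
rabbit_hosts = rabbit1:5672,rabbit2:5672
rabbit_userid = sahara
rabbit_password = RABBIT_PASS
rabbit_virtual_host = /
# Changing rabbit_ha_queues requires wiping the RabbitMQ database.
rabbit_ha_queues = True
# Retry interval and backoff, shown at their default values.
rabbit_retry_interval = 1
rabbit_retry_backoff = 2
```

With rabbit_max_retries left at its default of 0, connection attempts are retried indefinitely.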

Table 5.10. Description of Redis configuration options
Configuration option = Default value Description
[matchmaker_redis]
host = 127.0.0.1 (StrOpt) Host to locate redis.
password = None (StrOpt) Password for Redis server (optional).
port = 6379 (IntOpt) Use this port to connect to redis host.
[matchmaker_ring]
ringfile = /etc/oslo/matchmaker_ring.json (StrOpt) Matchmaker ring file (JSON).

Table 5.11. Description of RPC configuration options
Configuration option = Default value Description
[DEFAULT]
matchmaker_heartbeat_freq = 300 (IntOpt) Heartbeat frequency.
matchmaker_heartbeat_ttl = 600 (IntOpt) Heartbeat time-to-live.
rpc_backend = rabbit (StrOpt) The messaging driver to use, defaults to rabbit. Other drivers include qpid and zmq.
rpc_cast_timeout = 30 (IntOpt) Seconds to wait before a cast expires (TTL). Only supported by impl_zmq.
rpc_conn_pool_size = 30 (IntOpt) Size of RPC connection pool.
rpc_response_timeout = 60 (IntOpt) Seconds to wait for a response from a call.
rpc_thread_pool_size = 64 (IntOpt) Size of RPC greenthread pool.

Table 5.12. Description of testing configuration options
Configuration option = Default value Description
[DEFAULT]
fake_rabbit = False (BoolOpt) If passed, use a fake RabbitMQ provider.

Table 5.13. Description of ZeroMQ configuration options
Configuration option = Default value Description
[DEFAULT]
rpc_zmq_bind_address = * (StrOpt) ZeroMQ bind address. Should be a wildcard (*), an ethernet interface, or IP. The "host" option should point or resolve to this address.
rpc_zmq_contexts = 1 (IntOpt) Number of ZeroMQ contexts, defaults to 1.
rpc_zmq_host = localhost (StrOpt) Name of this node. Must be a valid hostname, FQDN, or IP address. Must match "host" option, if running Nova.
rpc_zmq_ipc_dir = /var/run/openstack (StrOpt) Directory for holding IPC sockets.
rpc_zmq_matchmaker = oslo.messaging._drivers.matchmaker.MatchMakerLocalhost (StrOpt) MatchMaker driver.
rpc_zmq_port = 9501 (IntOpt) ZeroMQ receiver listening port.
rpc_zmq_topic_backlog = None (IntOpt) Maximum number of ingress messages to locally buffer per topic. Default is unlimited.
