dd23020c309787ed94f59efe941213f83db51a6e
72 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
Tim Burke
|
ae6300af86 |
wsgi: Reap stale workers (after a timeout) following a reload
Add a new tunable, `stale_worker_timeout`, defaulting to 86400 (i.e. 24 hours). Once this time elapses following a reload, the manager process will issue SIGKILLs to any remaining stale workers. This gives operators a way to configure a limit for how long old code and configs may still be running in their cluster. To enable this, the temporary reload child (which waits for the reload to complete then closes the accept socket on all the old workers) has grown the ability to send state to the re-exec'ed manager. Currently, this is limited to just the set of pre-re-exec child PIDs and their reload times, though it was designed to be reasonably extensible. This allows the new manager to recognize stale workers as they exit instead of logging Ignoring wait() result from unknown PID ... With the improved knowledge of subprocesses, we can kick the log level for the above message up from info to warning; we no longer expect it to trigger in practice. Drive-by: Add logging to ServersPerPortStrategy.register_worker_exit that's comparable to what WorkersStrategy does. Change-Id: I8227939d04fda8db66fb2f131f2c71ce8741c7d9 |
||
|
Tim Burke
|
a55a48ffc8 |
docs: Call out that xprofile is not intended for production
Change-Id: I1e9d4d5df403040d69db93a08647cd0abe1b8037 |
||
|
Alistair Coles
|
e9abfd76ee |
backend ratelimit: support reloadable config file
Add support for a backend_ratelimit_conf_path option in the [filter:backend_ratelimit] config. If specified then the middleware will give precedence to config options from that file over config options from the [filter:backend_ratelimit] section. The path defaults to /etc/swift/backend-ratelimit.conf. The config file is periodically reloaded and any changed options are applied. The middleware will log a warning the first time it fails to load a config file that had previously been successfully loaded. The middleware also logs at info level when it first successfully loads a config file that had previously failed to be loaded. Otherwise, the middleware will log when a config file is loaded that results in the config being changed. Change-Id: I6554e37c6ab5b0a260f99b54169cb90ab5718f81 |
||
|
Zuul
|
24acc6e56b | Merge "Add backend rate limiting middleware" | ||
|
Matthew Oliver
|
bf4edefce4 |
DB Replicator: Add handoff_delete option
Currently the object-replicator has an option called `handoff_delete` which allows us to define the the number of replicas which are ensured in swift. Once a handoff node ensures that many successful responses it can go ahead and delete the handoff partition. By default it's 'auto' or rather the number of primary nodes. But this can be reduced. It's useful in draining full disks, but has to be used carefully. This patch adds the same option to the DB replicator and works the same way. But instead of deleting a partition it's done at the per DB level. Because it's done in the DB Replicator level it means the option is now available to both the Account and Container replicators. Change-Id: Ide739a6d805bda20071c7977f5083574a5345a33 |
||
|
Alistair Coles
|
ccaf49a00c |
Add backend rate limiting middleware
This is a fairly blunt tool: ratelimiting is per device and applied independently in each worker, but this at least provides some limit to disk IO on backend servers. GET, HEAD, PUT, POST, DELETE, UPDATE and REPLICATE methods may be rate-limited. Only requests with a path starting '<device>/<partition>', where <partition> can be cast to an integer, will be rate-limited. Other requests, including, for example, recon requests with paths such as 'recon/version', are unconditionally forwarded to the next app in the pipeline. OPTIONS and SSYNC methods are not rate-limited. Note that SSYNC sub-requests are passed directly to the object server app and will not pass though this middleware. Change-Id: I78b59a081698a6bff0d74cbac7525e28f7b5d7c1 |
||
|
Tim Burke
|
c374a7a851 |
Allow floats for all intervals
Change-Id: I91e9bc02d94fe7ea6e89307305705c383087845a |
||
|
Tim Burke
|
9eb81f6e69 |
Allow replication servers to handle all request methods
Previously, the replication_server setting could take one of three states: * If unspecified, the server would handle all available methods. * If "true", "yes", "on", etc. it would only handle replication methods (REPLICATE, SSYNC). * If any other value (including blank), it would only handle non-replication methods. However, because SSYNC tunnels PUTs, POSTs, and DELETEs through the same object-server app that's responding to SSYNC, setting `replication_server = true` would break the protocol. This has been the case ever since ssync was introduced. Now, get rid of that second state -- operators can still set `replication_server = false` as a principle-of-least-privilege guard to ensure proxy-servers can't make replication requests, but replication servers will be able to serve all traffic. This will allow replication servers to be used as general internal-to-the-cluster endpoints, leaving non-replication servers to handle client-driven traffic. Closes-Bug: #1446873 Change-Id: Ica2b41a52d11cb10c94fa8ad780a201318c4fc87 |
||
|
Clay Gerrard
|
4601548dab |
Deprecate per-service auto_create_account_prefix
If we move it to constraints it's more globally accessible in our code, but more importantly it's more obvious to ops that everything breaks if you try to mis-configure different values per-service. Change-Id: Ib8f7d08bc48da12be5671abe91a17ae2b49ecfee |
||
|
Clay Gerrard
|
e7cd8df5e9 |
Add option for debug query logging
Change-Id: Ic16b505a37748f50dc155212671efb45e2c5051f |
||
|
Gilles Biannic
|
a4cc353375 |
Make log format for requests configurable
Add the log_msg_template option in proxy-server.conf and log_format in a/c/o-server.conf. It is a string parsable by Python's format() function. Some fields containing user data might be anonymized by using log_anonymization_method and log_anonymization_salt. Change-Id: I29e30ef45fe3f8a026e7897127ffae08a6a80cd9 |
||
|
Clay Gerrard
|
06cf5d298f |
Add databases_per_second to db daemons
Most daemons have a "go as fast as you can then sleep for 30 seconds" strategy towards resource utilization; the object-updater and object-auditor however have some "X_per_second" options that allow operators much better control over how they spend their I/O budget. This change extends that pattern into the account-replicator, container-replicator, and container-sharder which have been known to peg CPUs when they're not IO limited. Partial-Bug: #1784753 Change-Id: Ib7f2497794fa2f384a1a6ab500b657c624426384 |
||
|
FatemaKhalid
|
cfeb32c66b |
Adding keep_idle config value to socket
User can cofigure KEEPIDLE time for sockets in TCP connection. The default value is the old value which is 600. Change-Id: Ib7fb166deb8a87ae4e97ba0671048b1ec079a2ef Closes-Bug:1759606 |
||
|
Samuel Merritt
|
8e651a2d3d |
Add fallocate_reserve to account and container servers.
The object server can be configured to leave a certain amount of disk space free; default is 1%. This is useful in avoiding 100%-full filesystems, as those can get Swift in a state where the filesystem is too full to write tombstones, so you can't delete objects to free up space. When a cluster has accounts/containers and objects on the same disks, then you can wind up with a 100%-full disk since account and container servers don't respect fallocate_reserve. This commit makes account and container servers respect fallocate_reserve so that disks shared between account/container and object rings won't get 100% full. When a disk's free space falls below the configured reserve, account and container PUT, POST, and REPLICATE requests will fail with a 507 status code. These are the operations that can significantly increase the disk space used by a given database. I called the parameter "fallocate_reserve" for consistency with the object server. No actual fallocate() call happens under Swift's control in the account or container servers (sqlite3 might make such a call, but it's out of our hands). Change-Id: I083442eef14bf83c0ea717b1decb3e6b56dbf1d0 |
||
|
Samuel Merritt
|
47fed6f2f9 |
Add handoffs-only mode to DB replicators.
The object reconstructor has a handoffs-only mode that is very useful when a cluster requires rapid rebalancing, like when disks are nearing fullness. This mode's goal is to remove handoff partitions from disks without spending effort on primary partitions. The object replicator has a similar mode, though it varies in some details. This commit adds a handoffs-only mode to the account and container replicators. Change-Id: I588b151ee65ae49d204bd6bf58555504c15edf9f Closes-Bug: 1668399 |
||
|
Alistair Coles
|
93fc9d2de8 |
Add cautionary note re delay_reaping in account-server.conf-sample
Change-Id: I2c3eea783321338316eecf467d30ba0b3217256c Related-Bug: #1514528 |
||
|
Ondřej Nový
|
99a13d9386 |
Fixed rysnc -> rsync typo
Change-Id: I671b4206072c6e22f4ae38033502336ec32e86ad |
||
|
Peter Lisák
|
ed772236c7 |
Change schedule priority of daemon/server in config
The goal is to modify schedule priority and I/O scheduling class and priority of daemon/server via configuration. Setting is optional, default keeps current behaviour. Use case: Prioritize object-server to object-auditor, because all user's requests needed to be served in peak hours and audit could wait. Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> DocImpact Change-Id: I1018a18f4706daabdb84574ffd9a58d831e68396 |
||
|
KATO Tomoyuki
|
bd29a3e3c7 |
Remove the duplicated word 'be'
Change-Id: I3ff4e7135d8d10c62dfcde90f34befe328ac39b2 |
||
|
Jenkins
|
a403faadd4 | Merge "Allow fallocate_reserve to be a percentage" | ||
|
Shashirekha Gundur
|
cf48e75c25 |
change default ports for servers
Changing the recommended ports for Swift services from ports 6000-6002 to unused ports 6200-6202; so they do not conflict with X-Windows or other services. Updated SAIO docs. DocImpact Closes-Bug: #1521339 Change-Id: Ie1c778b159792c8e259e2a54cb86051686ac9d18 |
||
|
Andy McCrae
|
0da9da5131 |
Allow fallocate_reserve to be a percentage
Add the ability to set the fallocate_reserve value as a percentage. This happens automatically when adding the '%' at the end of the value. Having the ability to set a % of free space rather than a byte value is useful especially when drive sizes are heterogenous. The default for fallocate_reserve has been adjusted to 1%, having the fallocate_reserve set seems sensible for all deploys and percentages are far safer to default than byte values (across drives of any size). Tests added for using fallocate_reserve as a percentage. Duplicate tests for fallocate_reserve have been removed. Docs updated to reflect the fallocate_reserve change. Change-Id: I4aea613a708205c917e81d6b2861396655e73238 |
||
|
gh159m
|
b5311f63db |
Removed default value for log_statsd_host
Multiple files and documents showed that log_statsd_host had a default value, usually localhost. This was incorrect, instead setting a value for log_statsd_host enables statsd logging. Removed any reference of log_statsd_host having a default value. Also changed descriptions to show setting a value enables logging. Change-Id: I3ca5c0e8b8e4981de3aa6db0c476072b5a59723d Closes-Bug: #1542227 |
||
|
Peter Lisák
|
28c4b7310f |
Unification of manpages and conf-samples (default values, etc)
Change-Id: I47a3127ef698b4bd1537b1562901ee9c2b5924d4 |
||
|
Alistair Coles
|
1a2b54fc0a |
Fix missing *-replicator conf sections in deployment guide
The doc for these sections was missing because of an rst error - the source is there in rst file but didn't make it into the html output. Add doc for per_diff and max_diffs in account and container doc sections. Also, fix a bunch of other sphinx build errors and most of the warnings. Change-Id: If9ed2619b2f92c6c65a94f41d8819db8726d3893 |
||
|
Romain LE DISEZ
|
71f6fd025e |
Allows to configure the rsync modules where the replicators will send data
Currently, the rsync module where the replicators send data is static. It
forbids administrators to set rsync configuration based on their current
deployment or needs.
As an example, the rsyncd configuration example encourages to set a connections
limit for the modules account, container and object. It permits to protect
devices from excessives parallels connections, because it would impact
performances.
On a server with many devices, it is tempting to increase this number
proportionally, but nothing guarantees that the distribution of the connections
will be balanced. In the worst scenario, a single device can receive all the
connections, which is a severe impact on performances.
This commit adds a new option named 'rsync_module' to the *-replicator sections
of the *-server configuration file. This configuration variable can be
extrapolated with device attributes like ip, port, device, zone, ... by using
the format {NAME}. eg:
rsync_module = {replication_ip}::object_{device}
With this configuration, an administrators can solve the problem of connections
distribution by creating one module per device in rsyncd configuration.
The default values are backward compatible:
{replication_ip}::account
{replication_ip}::container
{replication_ip}::object
Option vm_test_mode is deprecated by this commit, but backward compatibility is
maintained. The option is only effective when rsync_module is not set. In that
case, {replication_port} is appended to the default value of rsync_module.
Change-Id: Iad91df50dadbe96c921181797799b4444323ce2e
|
||
|
Charles Hsu
|
345785837f |
Remove error_suppression_interval, error_suppression_limit options.
These two options is belong to proxy-server, not account-replicator. Change-Id: Ie030ecffd213e56db32df77c69b847479d96308f |
||
|
Joanna H. Huang
|
af8d842076 |
Replaced setting run_pause with standard interval
The deprecated directive `run_pause` should be replaced with the more standard one `interval`. The `run_pause` should be still supported for backward compatibility. This patch updates object replicator to use `interval` and support `run_pause`. It also updates its sample config and documentation. Co-Authored-By: Joanna H. Huang <joanna.huitzu.huang@gmail.com> Co-Authored-By: Kamil Rykowski <kamil.rykowski@intel.com> Change-Id: Ie2a3414a96a94efb9273ff53a80b9d90c74fff09 Closes-Bug: #1364735 |
||
|
Prashanth Pai
|
9c33bbde69 |
Allow rsync to use compression
From rsync's man page: -z, --compress With this option, rsync compresses the file data as it is sent to the destination machine, which reduces the amount of data being transmitted -- something that is useful over a slow connection. A configurable option has been added to allow rsync to compress, but only if the remote node is in a different region than the local one. NOTE: Objects that are already compressed (for example: .tar.gz, .mp3) might slow down the syncing process. On wire compression can also be extended to ssync later in a different change if required. In case of ssync, we could explore faster compression libraries like lz4. rsync uses zlib which is slow but offers higher compression ratio. Change-Id: Ic9b9cbff9b5e68bef8257b522cc352fc3544db3c Signed-off-by: Prashanth Pai <ppai@redhat.com> |
||
|
Rafael Rivero
|
c1f6569c00 |
Fixes several typos (Swift)
Corrects spelling errors found in comments. Change-Id: I228a888e3f256569ea32ef1613092dbd63e13c62 |
||
|
John Dickinson
|
b7281cf2c5 |
make the bind_port config setting required
In a long-term effort to change the recommended ports for Swift, the first step is to require the bind_port in config files. Later, we can change the recommended setting. Anyone currently explicitly setting the ports will not be affected. Anyone not setting the ports will need to specify them to match their rings. DocImpact Change-Id: Icca83a263acdd0afc9016424a3e9f8c15e944789 |
||
|
Matthew Oliver
|
090baa1fa9 |
Swift configuration parameter audit
This change is the result of an audit through the config parameters
provided by swift and how/if they are addressed in the swift
documentation. The documentation being the sample config files in
the /etc directory or the documentation.
This change is only concerned with the config files in etc/ next
I will look at the documentation in the doc/ folder.
This change makes the following assumptions:
- Unless stated otherwise, the commented out parameter in the
sample configuration is the default for swift.
- When the default in the code differs from that of the sample
configuration, the default in the code is correct.
Container reconciler:
Parameter: interval
- code: 30
- config: 300
Result: config = 30
Object Expirer:
Parameter: recon_cache_path
- code: /var/cache/swift
- config: Parameter missing
Result: Add parameter
swift-dispersion-populate && swift-dispersion-report
Parameter: auth_version
- code: 1.0
- config: 2.0 (due to being a confusing example of how to setup
version 2.0).
Result: Added 'auth_version = 1.0' to the right section (showing
default and make the sample configuration for auth version
2.0 easier to understand.
swift-drive-audit:
Parameter: log_file_pattern
- code: /var/log/kern.*[!.][!g][!z]
- config: /var/log/kern*
Result: config = /var/log/kern.*[!.][!g][!z]
NOTE: swift-drive-audit uses a parameter called device_dir which
defaults to '/srv/node'. In all other swift binaries/services
there is a similar parameter called devices which stores the
same thing. This is an inconsistency which I haven't fixed
as this could break existing swift clusters out in the wild.
Proxy Server:
Parameter: object_chunk_size
- code: 65536
- config: 8192
Result: config = 65536
Parameter: client_chunk_size
- code: 65536
- config: 8192
Result: config = 65536
Parameter: strict_cors_mode
- code: True
- config: No parameter
Result: config = True
Account and Container replicator configuration confusion:
NOTES:
The account and container replicators have parameters:
- interval
- run_pause
Both of these are loaded into the same variable in code:
self.interval = int(conf.get('interval') or
conf.get('run_pause') or 30)
If a user sets both to different values then interval is used.
Result: Update the configuration to make this more clear.
DocImpact
Change-Id: Iaadbb1a6284f8b3e0801bc343b29772f70f4bf6e
|
||
|
gholt
|
2d00f7b7ba |
New log_max_line_length option.
Log lines can get quite large, as we previously noticed with rsync error log lines. We added a setting to cap those, but it really looks like we should have just done this overall limit. We noticed the issue when we switched to UDP syslogging and it would occasionally blow past the 16436 lo MTU! This causes Python's logging code to get an error and hilarity ensues. Change-Id: I44bdbe68babd58da58c14360379e8fef8a6b75f7 |
||
|
zhang-hare
|
f5caac43ac |
Add profiling middleware in Swift
The profile middleware provide a tool to profile Swift code on the fly and collect statistic data for performance analysis. An native simple Web UI is also provided to help query and visualize the data. Change-Id: I6a1554b2f8dc22e9c8cd20cff6743513eb9acc05 Implements: blueprint profiling-middleware |
||
|
Jenkins
|
a2126add0b | Merge "Set default wsgi workers to cpu_count" | ||
|
Newptone
|
5c1a7871d9 |
Unified format of boolean params in conf files
In swift conf files, boolean options use different format: some use true/false, and some use True/False. This patch is aim to using lowcase true/false to unify boolean params formats in swift conf files. Fix Bug #1203421 Change-Id: I3e1bfc6e43231f51e0710aa54869f3774ee896b1 |
||
|
Clay Gerrard
|
de3acec4bf |
Set default wsgi workers to cpu_count
Change the default value of wsgi workers from 1 to auto. The new default value for workers in the proxy, container, account & object wsgi servers will spawn as many workers per process as you have cpu cores. This will not be ideal for some configurations, but it's much more likely to produce a successful out of the box deployment. Inspect the number of cpu_cores using python's multiprocessing when available. Multiprocessing was added in python 2.6, but I know I've compiled python without it before on accident. The cpu_count method seems to be pretty system agnostic, but it says it can raise NotImplementedError or sometimes return 0. Add a new utility method 'config_auto_int_value' to pull an integer out of the config which has a dynamic default. * drive by s/container/proxy/ in proxy-server.conf.5 * fix misplaced max_clients in *-server.conf-sample * update doc/development_saio to force workers = 1 DocImpact Change-Id: Ifa563d22952c902ab8cbe1d339ba385413c54e95 |
||
|
Samuel Merritt
|
efdb0e3681 |
Make sample configs more readable.
Inject some empty lines to avoid the wall-of-text effect and to make it a little clearer which descriptions go with which options. Change-Id: I58914b83dad76ea5ca330903a246bee7ffaeba83 |
||
|
Donagh McCabe
|
34e2ab3f31 |
account-reaper warns if not making progress
DocImpact If account reaper has not managed to clean out an account after a long period, it prints a message to the log (you can search your system looking for such messages). Introduce reap_warn_after config variable to determine when to emit the message (defaults to 30 days). Also fix bug 1181995 (edge case where object name is an empty string) Change-Id: Ic0dfee04742d06b6a51b59f302d7a272d7c1de92 |
||
|
Sergey Kraynev
|
ea7858176b |
Implementation of replication servers
Support separate replication ip address: - Added new function in utils. This function provides ability to select separate IP address for replication service. - Db_replicator and object replicators were changed. Replication process uses new function now. Replication network parameters: - Replication network fields (replication_ip, replication_port) support was added to device dictionary in swift-ring-builder script. - Changes were made to support new fields in search, show and set_info functions. Implementation of replication servers: - Separate replication servers use the same code as normal replication servers, but with replication_server parameter = True. When using a separate replication network, the non-replication servers set replication_server = False. When there is no separate replication network (the default case), replication_server is not included in the config. DocImpact Change-Id: Ie9af5bdcdf9241c355e36053ca4adfe49dc35bd0 Implements: blueprint dedicated-replication-network |
||
|
Peter Portante
|
2d42b37303 |
Add the max_clients parameter to bound clients
The new max_clients parameter allows one full control over the maximum number of client requests that will be handled by a given worker for any of the proxy, account, container or object servers. Lowering the number of clients handled per worker, and raising the number of workers can lessen the impact that a CPU intensive, or blocking, request can have on other requests served by the same worker. If the maximum number of clients is set to one, then a given worker will not perform another accept(2) call while processing, allowing other workers a chance to process it. DocImpact Signed-off-by: Peter Portante <peter.portante@redhat.com> Change-Id: Ic01430f7a6c5ff48d7aa349dc86a5f8ac463a420 |
||
|
Jenkins
|
249a65461e | Merge "Adding speed limit options for DB auditor" | ||
|
Samuel Merritt
|
a4a047c4ec |
Fix descriptions in sample configs.
Change-Id: I7aca3c6cafd9391031f7a10cc233f99e81ee0393 |
||
|
yuan-zhou
|
09370862ca |
Adding speed limit options for DB auditor
Fix bug 1129760 Without speed limit, DB auditor will likely consume high CPU% on storage node. That will highly impact the cluster's performance. This patch adds two options for account/container auditor: - containers_per_second: Maximum containers audited per second - accounts_per_second: Maximum accounts audited per second DocImpact Change-Id: I9faa506438185a83ca77db4906969328624d015f |
||
|
Jenkins
|
23f33b2069 | Merge "Make statsd sample rate behave better." | ||
|
gholt
|
87a42ab9ca |
Added fallocate_reserve option
Some systems behave badly when they completely run out of space. To alleviate this problem, you can set the fallocate_reserve conf value to a number of bytes to "reserve" on each disk. When the disk free space falls at or below this amount, fallocate calls will fail, even if the underlying OS fallocate call would succeed. For example, a fallocate_reserve of 5368709120 (5G) would make all fallocate calls fail, even for zero-byte files, when the disk free space falls under 5G. The default fallocate_reserve is 0, meaning "no reserve", and so the software behaves exactly as it always has unless you set this conf value to something non-zero. Also fixed ring builder's search_devs doc bugs. Related: To get rsync to do the same, see https://github.com/rackspace/cloudfiles-rsync Specifically, see this patch: https://github.com/rackspace/cloudfiles-rsync/blob/master/debian/patches/limit-fs-fullness.diff DocImpact Change-Id: I8db176ae0ca5b41c9bcfeb7cb8abb31c2e614527 |
||
|
Darrell Bishop
|
8801b74090 |
Make statsd sample rate behave better.
As Dieter pointed out in bug 1090495 (https://bugs.launchpad.net/swift/+bug/1090495), the volume of metrics can vary wildly between StatsD metrics. This patch implements a partial solution by reducing the sample_rate used for known high-volume metrics (operational experience will need to inform this over time) and introducing a new tunable, log_statsd_sample_rate_factor which is multiplied by the sample_rate for every statsd stat. This tunable can be used to reduce StatsD traffic proportionally for all metrics and is intended to replace log_statsd_default_sample_rate, which is left alone for backward-compatibility, should anyone be using it. This patch also includes a drive-by fix for log_udp_port which wasn't being converted to an int (I didn't verify that actually causes trouble in SysLogHandler(), but it's definitely an improvement regardles). Change-Id: Id404636e3629f6431cf1c4e64a143959750a3c23 |
||
|
Jenkins
|
8b770aa55e | Merge "Add config option to turn eventlet debug on/off" | ||
|
Chuck Thier
|
4c6a354483 |
Add config option to turn eventlet debug on/off
By default, this will be turned off. This will cause eventlet to not print stack traces to stderr which can be very annoying on production systems. It is still recommended to turn it on for development or debuging purposes. DocImpact Change-Id: I5e5b902d3d9ed85f784549e53f2ee2fc87cbe2e5 |
||
|
clayg
|
3a70112d03 |
Add config of server start timeouts for probetests
Currently the timeout for a wsgi server successfully binding to a port and for a probetest background service to finish starting are hard coded to 30 seconds. While a reasonable default for most configurations, a small virtualized environment may need a little more time in order for probe tests to complete successfully. This patch adds a 'bind_timeout' option to the DEFAULT section of the main wsgi servers' config. Also a new [probe_test] section and 'check_server_timeout' option to test.conf DocImpact Change-Id: Ibcaff153c7633bbf32e460fd9dbf04932eddb56f |