34b4bf34d223c7c289dde5441e93ebfea90eade9
265 Commits

Author SHA1 Message Date
Jenkins
34b4bf34d2 Merge "Added discoverable capabilities." 2013年11月28日 00:11:35 +00:00
Jenkins
d8e46eba47 Merge "slightly less early quorum" 2013年11月27日 15:59:37 +00:00
Peter Portante
6e313e957d Fix for memcache middleware configuration
The documentation rightly said to use "memcache_max_connections", but
the code was looking for "max_connections", and only looking for it in
proxy-server.conf, not in memcache.conf as a fall back.
This commit brings the code coverage for the memcache middleware to
100%.
Closes-Bug: 1252893
Change-Id: I6ea64baa2f961a09d60b977b40d5baf842449ece
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013年11月26日 18:03:33 +00:00
Michael Barton
7207926cff slightly less early quorum
The early quorum change has maybe added a little bit too much
eventual to the consistency of requests in Swift, and users can
sometimes get unexpected results.
This change gives us a knob to turn in finding the right balance,
by adding a timeout where pending requests can finish after quorum
is achieved.
Change-Id: Ife91aaa8653e75b01313bbcf19072181739e932c
2013年11月25日 21:25:55 +00:00
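The knob described in "slightly less early quorum" can be sketched roughly as follows. This is only an illustration built on Python's standard `concurrent.futures`, not Swift's eventlet-based proxy code; the names `gather_with_quorum` and `post_quorum_timeout` are hypothetical.

```python
import concurrent.futures as cf
import time


def gather_with_quorum(tasks, quorum, post_quorum_timeout):
    """Run tasks concurrently; once `quorum` of them have finished,
    give the remaining ones up to `post_quorum_timeout` seconds."""
    pool = cf.ThreadPoolExecutor(max_workers=len(tasks))
    pending = {pool.submit(task) for task in tasks}
    done = set()
    # Phase 1: block until quorum is reached (or nothing is left).
    while len(done) < quorum and pending:
        finished, pending = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
        done |= finished
    # Phase 2: bounded grace period for requests that are still pending.
    finished, pending = cf.wait(pending, timeout=post_quorum_timeout)
    done |= finished
    pool.shutdown(wait=False)
    return [f.result() for f in done], len(pending)


def slow():
    time.sleep(0.5)
    return "slow"


# With a short grace period the slow request is left behind.
results, unfinished = gather_with_quorum(
    [lambda: "a", lambda: "b", slow], quorum=2, post_quorum_timeout=0.05)
```

A timeout of zero reproduces the plain early-quorum behavior, while a larger value trades latency for fewer surprising results.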
Richard (Rick) Hawkins
2c4bf81464 Added discoverable capabilities.
Swift can now optionally be configured to allow requests to '/info',
providing information about the swift cluster. Additionally, HMAC-signed
requests to
'/info?swiftinfo_sig=<sign>&swiftinfo_expires=<expires>' can be
configured, allowing privileged access to more sensitive information
not meant to be public.
DocImpact
Change-Id: I2379360fbfe3d9e9e8b25f1dc34517d199574495
Implements: blueprint capabilities
Closes-Bug: #1245694 
2013年11月22日 15:54:13 -06:00
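A signed '/info' request from "Added discoverable capabilities" might be built along these lines. The commit message does not say what string is signed, so this sketch assumes HMAC-SHA1 over the method, expiry, and path joined by newlines; `signed_info_url` and the admin key are hypothetical.

```python
import hmac
import time
from hashlib import sha1


def signed_info_url(admin_key, expires_in=3600, now=None):
    """Build a signed /info URL (assumed signing layout, see lead-in)."""
    expires = int((time.time() if now is None else now) + expires_in)
    # Assumption: the signed message is "METHOD\nEXPIRES\nPATH".
    message = "GET\n%d\n/info" % expires
    sig = hmac.new(admin_key.encode(), message.encode(), sha1).hexdigest()
    return "/info?swiftinfo_sig=%s&swiftinfo_expires=%d" % (sig, expires)


url = signed_info_url("example-admin-key", expires_in=60, now=1000)
```

The expiry is part of the signed message, so a client cannot extend the lifetime of a signature it has been given.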
gholt
c859ebf5ce Per device replication_lock
New replication_one_per_device (True by default)
that restricts incoming REPLICATION requests to
one per device, replication_concurrency allowing.
Also has replication_lock_timeout (15 by default)
to control how long a request will wait to obtain
a replication device lock before giving up.
This should be very useful in that you can be
assured any concurrent REPLICATION requests are
each writing to distinct devices. If you have 100
devices on a server, you can set
replication_concurrency to 100 and be confident
that, even if 100 replication requests were
executing concurrently, they'd each be writing to
separate devices. Before, all 100 could end up
writing to the same device, bringing it to a
horrible crawl.
NOTE: This is only for ssync replication. The
current default rsync replication still has the
potentially horrible behavior.
Change-Id: I36e99a3d7e100699c76db6d3a4846514537ff685
2013年11月22日 21:40:29 +00:00
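The idea behind "Per device replication_lock" can be illustrated with an in-process sketch. Swift's object server has to coordinate across worker processes, so this is not how it is implemented there; `DeviceLocks` and `ReplicationLockTimeout` are hypothetical names, and only the 15-second default comes from the commit message.

```python
import threading
from contextlib import contextmanager


class ReplicationLockTimeout(Exception):
    """Raised when a device lock cannot be obtained in time."""


class DeviceLocks:
    def __init__(self, lock_timeout=15):
        self.lock_timeout = lock_timeout
        self._locks = {}
        self._guard = threading.Lock()

    @contextmanager
    def replication_lock(self, device):
        # One lock per device, created lazily.
        with self._guard:
            lock = self._locks.setdefault(device, threading.Lock())
        # Wait a bounded time for the device, then give up.
        if not lock.acquire(timeout=self.lock_timeout):
            raise ReplicationLockTimeout(device)
        try:
            yield
        finally:
            lock.release()
```

Requests for different devices proceed in parallel, while a second request for a busy device waits up to `lock_timeout` seconds and then fails.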
Jenkins
37220086d7 Merge "improve docs in etc/dispersion.conf-sample" 2013年11月21日 03:02:08 +00:00
gholt
a80c720af5 Object replication ssync (an rsync alternative)
For this commit, ssync is just a direct replacement for how
we use rsync. Assuming we switch over to ssync completely
someday and drop rsync, we will then be able to improve the
algorithms even further (removing local objects as we
successfully transfer each one rather than waiting for whole
partitions, using an index.db with hash-trees, etc., etc.)
For easier review, this commit can be thought of in distinct
parts:
1) New global_conf_callback functionality for allowing
 services to perform setup code before workers, etc. are
 launched. (This is then used by ssync in the object
 server to create a cross-worker semaphore to restrict
 concurrent incoming replication.)
2) A bit of shifting of items up from object server and
 replicator to diskfile or DEFAULT conf sections for
 better sharing of the same settings. conn_timeout,
 node_timeout, client_timeout, network_chunk_size,
 disk_chunk_size.
3) Modifications to the object server and replicator to
 optionally use ssync in place of rsync. This is done in
 a generic enough way that switching to FutureSync should
 be easy someday.
4) The biggest part, and (at least for now) completely
 optional part, are the new ssync_sender and
 ssync_receiver files. Nice and isolated for easier
 testing and visibility into test coverage, etc.
All the usual logging, statsd, recon, etc. instrumentation
is still there when using ssync, just as it is when using
rsync.
Beyond the essential error and exceptional condition
logging, I have not added any additional instrumentation at
this time. Unless there is something someone finds super
pressing to have added to the logging, I think such
additions would be better as separate change reviews.
FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION
CLUSTERS. Some of us will be using it in a limited fashion to look
for any subtle issues, tuning, etc. but generally ssync is
an experimental feature. In its current implementation it is
probably going to be a bit slower than rsync, but if all
goes according to plan it will end up much faster.
There are no comparisons yet between ssync and rsync other
than some raw virtual machine testing I've done to show it
should compete well enough once we can put it in use in the
real world.
If you Tweet, Google+, or whatever, be sure to indicate it's
experimental. It'd be best to keep it out of deployment
guides, howtos, etc. until we all figure out if we like it,
find it to be stable, etc.
Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6
2013年11月07日 16:52:01 +00:00
Kun Huang
264766127e improve docs in etc/dispersion.conf-sample
1. add a comment to hint using a new account for using dispersion tools
2. change sample url for keystone from 'saio' to 'localhost'
Change-Id: I4683f5eb0af534b39112f1b7420f67d569c29b3a
2013年10月28日 17:41:09 +08:00
Jenkins
920680ffdd Merge "Faster swift-dispersion-populate" 2013年10月22日 17:55:59 +00:00
Florian Hines
42f4b150e3 Faster swift-dispersion-populate
- Makes swift-dispersion-populate a bit faster when using a larger
 dispersion_coverage with a larger part_power.
- Adds option to only run population for container OR objects
- Adds option to let you resume population at given point (useful if you
 need to resume population after a previous run error'd out or the
 like) by specifying which suffix to start at.
The original populate just randomly used uuid4().hex as a suffix on the
container/object names until all the required partitions were covered.
This isn't a big deal if you're only doing 1% coverage on a ring with a
small part power but takes ages if you're doing 100% on a larger ring.
Change-Id: I52f890a774412c1d6179f12db9081aedc58b6bc2
2013年09月05日 18:12:15 -05:00
John Dickinson
a9aec73098 add reseller_admin_role to sample config
Change-Id: Ia8e62eef5af9e849e86c3ff14ce7f8aaa5f21abf
2013年09月05日 12:27:18 -07:00
Jenkins
621ea520a5 Merge "Added container listing ratelimiting" 2013年08月23日 15:46:53 +00:00
Jenkins
b4c5d6b6c0 Merge "Make the length of a line logged configurable" 2013年08月22日 22:41:04 +00:00
Greg Lange
176a34161f Make the length of a line logged configurable
Failed calls to rsync can result in very long log lines. These lines
are mostly made up of file paths and are not always useful. This change
allows the length of these logged lines to be reduced if desired.
Change-Id: I9a28f19eadc07757da9d42b0d7be1ed82170d732
2013年08月21日 19:38:24 +00:00
gholt
52eca4d8a7 Implements configurable swift_owner_headers
These are headers that will be stripped unless the WSGI environment
contains a true value for 'swift_owner'. The exact definition of a
swift_owner is up to the auth system in use, but usually indicates
administrative responsibilities.
DocImpact
Change-Id: I972772fbbd235414e00130ca663428e8750cabca
2013年08月15日 16:42:58 -07:00
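The stripping rule from "Implements configurable swift_owner_headers" can be sketched as below. Only the 'swift_owner' environment key comes from the commit message; the function name and the example header are hypothetical.

```python
def strip_owner_headers(headers, environ, owner_headers):
    """Return a copy of `headers` without owner-only entries,
    unless the WSGI environment marks the request as swift_owner."""
    if environ.get("swift_owner"):
        return dict(headers)
    blocked = {name.lower() for name in owner_headers}
    return {k: v for k, v in headers.items() if k.lower() not in blocked}


headers = {"X-Example-Secret": "s3cr3t", "Content-Length": "0"}
public = strip_owner_headers(headers, {}, ["x-example-secret"])
owner = strip_owner_headers(headers, {"swift_owner": True}, ["x-example-secret"])
```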
gholt
c8795e6e85 Added container listing ratelimiting
Change-Id: If4e9cfe4e4c743de1f39704acf849164cf3f0bd0
2013年08月14日 12:40:25 +00:00
Vincent Untz
7f1aa9d1e8 Allow dispersion tools to use keystone server with insecure certificate
The swift-dispersion-populate and swift-dispersion-report tools now
accept a --insecure option.
Also, dispersion.conf now has a keystone_api_insecure option.
Default is obviously to use the secure path.
DocImpact
Change-Id: I4000352e547d9ce5b08ade54e0c886281caff891
2013年08月05日 22:44:12 +02:00
Jenkins
f830c9a266 Merge "Obscure the X-Auth-Token in proxy log" 2013年07月30日 20:58:59 +00:00
Jenkins
a2126add0b Merge "Set default wsgi workers to cpu_count" 2013年07月30日 19:12:28 +00:00
Donagh McCabe
eb99e8f84c Obscure the X-Auth-Token in proxy log
The X-Auth-Token is sensitive data. If revealed to an unauthorized person,
they can make requests against an account until the token expires.
This implementation maintains current behavior (i.e., the token
is logged). Implementers can choose to set reveal_sensitive_prefix
to (e.g.) 12 so that only the first 12 characters of the token are logged.
Or, set it to 0 to replace the token with "...".
DocImpact
Part of bug #1004114
Change-Id: Iecefa843d8f9ef59b9dcf0860e7a4d0e186a6cb5
2013年07月30日 09:37:27 +01:00
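The effect of reveal_sensitive_prefix in "Obscure the X-Auth-Token in proxy log" can be sketched as follows. The function name is hypothetical, and using `None` to mean "unset, log the whole token" is an assumption made for this illustration.

```python
def obscure_token(token, reveal_sensitive_prefix=None):
    """Return the form of `token` that would be written to the log."""
    if reveal_sensitive_prefix is None:
        # Assumed representation of "unset": keep logging the full token.
        return token
    if len(token) <= reveal_sensitive_prefix:
        # Nothing to hide beyond the revealed prefix.
        return token
    return token[:reveal_sensitive_prefix] + "..."


token = "AUTH_tk0123456789abcdef"
short = obscure_token(token, 12)
hidden = obscure_token(token, 0)
```

Keeping a short prefix still lets operators correlate log lines for one token without exposing enough of it to be reused.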
Jenkins
82aeacf5bf Merge "Allow floating point value for dispersion_coverage" 2013年07月29日 21:53:19 +00:00
Jenkins
3349013dff Merge "Configuration options for error regex and log file in the config now" 2013年07月29日 18:23:39 +00:00
Jenkins
e155f6da18 Merge "Add bulk middleware to proxy-server.conf-sample" 2013年07月26日 23:04:20 +00:00
Marcelo Martins
d2dd3e5488 Configuration options for error regex and log file in the config now
Making it possible for one to overwrite the default set of regexes
used to search for device block errors in the log file. Also making
the log file naming pattern configurable by setting them in the
drive-audit.conf file.
Updating "Detecting Failed Drives" section on the admin guide as well.
Change-Id: I7bd3acffed196da3e09db4c9dcbb48a20bdd1cf0
2013年07月23日 07:24:29 -05:00
Newptone
5c1a7871d9 Unified format of boolean params in conf files
In swift conf files, boolean options use different
formats: some use true/false, and some use True/False.
This patch aims to unify the format of boolean params
in swift conf files by using lowercase true/false.
Fix Bug #1203421
Change-Id: I3e1bfc6e43231f51e0710aa54869f3774ee896b1
2013年07月23日 15:40:05 +08:00
Clay Gerrard
de3acec4bf Set default wsgi workers to cpu_count
Change the default value of wsgi workers from 1 to auto. The new default
value for workers in the proxy, container, account & object wsgi servers will
spawn as many workers per process as you have cpu cores.
This will not be ideal for some configurations, but it's much more likely to
produce a successful out of the box deployment.
Inspect the number of cpu_cores using python's multiprocessing when available.
Multiprocessing was added in python 2.6, but I know I've compiled python
without it before on accident. The cpu_count method seems to be pretty system
agnostic, but it says it can raise NotImplementedError or sometimes return 0.
Add a new utility method 'config_auto_int_value' to pull an integer out of the
config which has a dynamic default.
 * drive by s/container/proxy/ in proxy-server.conf.5
 * fix misplaced max_clients in *-server.conf-sample
 * update doc/development_saio to force workers = 1
DocImpact
Change-Id: Ifa563d22952c902ab8cbe1d339ba385413c54e95
2013年07月18日 22:57:18 -07:00
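The pieces described in "Set default wsgi workers to cpu_count" might fit together as in this sketch. The name `config_auto_int_value` is taken from the commit message, but its signature and the helper `default_workers` are assumptions made here.

```python
import multiprocessing


def config_auto_int_value(value, default):
    """Return `default` when the option is unset or 'auto',
    otherwise the option parsed as an integer."""
    if value is None or (isinstance(value, str) and value.lower() == "auto"):
        return default
    return int(value)  # raises ValueError for non-numeric text


def default_workers():
    """One worker per CPU core, falling back to 1 when the count
    is unavailable or reported as 0."""
    try:
        count = multiprocessing.cpu_count()
    except NotImplementedError:
        count = 0
    return count or 1


workers = config_auto_int_value("auto", default_workers())
```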
Chmouel Boudjnah
89ccd95996 Add bulk middleware to proxy-server.conf-sample
- Fixes bug 1201844.
Change-Id: I8eed54d0a17a0c6b746ed616634fc9adb89e5f37
2013年07月17日 15:46:03 +02:00
Thomas Leaman
5449155fb0 Allow floating point value for dispersion_coverage
For systems with very large numbers of partitions, 1% dispersion
coverage may simply be too much/take too long. This fix allows <1
values to be used for dispersion_coverage.
DocImpact
Change-Id: I5ed35b69754d55a410e66e658b3854de57c7666b
2013年07月09日 09:29:07 +00:00
David Goetz
043bfb77f4 Record some simple object stats in the object auditor
Change-Id: I043a80c38091f59ce6707730363a4b43b29ae6ec
2013年07月02日 13:41:18 -07:00
David Goetz
9f942b1256 Allow SLOs to be made up of other SLOs
We've gone back and forth about this. In the initial commit, it couldn't
possibly work because you wouldn't be able to get the Etags to match. Then it
was expressly disallowed with a custom error message, and now it's allowed. The
reason we're allowing it is that 1,000 segments isn't enough for some use cases
and we decided it's better than just upping the number of allowed segments. The
code to make it work isn't all that complicated and it allows for virtually
unlimited SLO object size. There is also a new configurable limit on the
maximum connection time for both SLOs and DLOs defaulting to 1 day. This will
hopefully alleviate worries about infinite requests. Think I'll leave the
python-swift client support for nested SLOs to somebody else though :).
DocImpact
Change-Id: Id16187481b37e716d2bd09bdbab8cc87537e3ddd
2013年06月26日 09:44:33 -07:00
Jenkins
edf4068c8b Merge "Local write affinity for object PUT requests." 2013年06月26日 04:39:53 +00:00
Kun Huang
9fc55ee2b4 Add sample rsyslog.conf.
Give users a sample rsyslog.conf so they can easily manage their logs, based
on the LOCAL0 facility that swift uses. This patch offers several choices for
log output, which can be selected by commenting or uncommenting lines.
Change-Id: I2fe150a6e3d164a989c3520c0b7f032897a71f18
2013年06月25日 10:24:26 +08:00
Samuel Merritt
d9f2a76973 Local write affinity for object PUT requests.
The proxy can now be configured to prefer local object servers for PUT
requests, where "local" is governed by the "write_affinity" setting. The
"write_affinity_node_count" setting controls how many local object
servers to try before giving up and going on to remote ones.
I chose to simply re-order the object servers instead of filtering out
nonlocal ones so that, if all of the local ones are down, clients can
still get successful responses (just slower).
The goal is to trade availability for throughput. By writing to local
object servers across fast LAN links, clients get better throughput
than if the object servers were far away over slow WAN links. The
downside, of course, is that data availability (not durability) may
suffer when drives fail.
The default configuration has no write affinity in it, so the default
behavior is unchanged.
Added some words about these settings to the admin guide.
DocImpact
Change-Id: I09a0bd00524544ff627a3bccdcdc48f40720a86e
2013年06月23日 22:04:56 -07:00
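The re-ordering approach from "Local write affinity for object PUT requests" can be sketched like this. It shows the idea only: up to a given number of local nodes come first and everything else follows in its original order, so no node is ever filtered out. The function name and the node dictionaries are hypothetical.

```python
def order_for_write(nodes, is_local, local_node_count):
    """Put up to `local_node_count` local nodes first, then the rest."""
    preferred = [n for n in nodes if is_local(n)][:local_node_count]
    rest = [n for n in nodes if n not in preferred]
    return preferred + rest


nodes = [
    {"id": 1, "region": 2},
    {"id": 2, "region": 1},
    {"id": 3, "region": 2},
    {"id": 4, "region": 1},
]
ordered = order_for_write(nodes, lambda n: n["region"] == 1, 1)
```

Because remote nodes stay in the list, a PUT can still succeed when every local object server is down, only more slowly.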
Samuel Merritt
e8e6bffc95 Clean up some remnants of StaticWeb's and TempURL's custom caching.
As of e499b91, these config values and functions are unused in StaticWeb.
As of 2e155e5, the comment in TempURL is false.
Change-Id: I75b631ece8a9a6075c406765361629c549c449f6
2013年06月21日 17:37:06 -07:00
Kun Huang
5c8785aaee Add max_header_size to swift.conf-sample and relative UT
1. Add an explanation of MAX_HEADER_SIZE to swift.conf-sample, in the same way
as for other settings in swift.conf. In particular, point out that the default
size of a header line in eventlet is 8192, which is the main reason why we set
8192 for MAX_HEADER_SIZE in swift.
2. Add some unit tests to check for valid settings in swift.conf. Test cases
in test_constraints use /etc/swift/swift.conf if it exists, and if any
wrong settings are in it (MAX_META_VALE > MAX_META_OVERALL_SIZE), swift's
unit tests must fail. These new unit tests cover that case.
Change-Id: I7bb21951d46050163c1b7bceac8d49302b9209f7
2013年06月19日 23:45:38 +08:00
Jenkins
5bfd2d798d Merge "Add parallelism to object expirer daemon." 2013年06月11日 22:48:24 +00:00
Jenkins
b63b5d590a Merge "Use threadpools in the object server for performance." 2013年06月11日 22:47:07 +00:00
Samuel Merritt
f559c50acb Local read affinity for GET/HEAD requests.
Now you can configure the proxy server to read from "local" primary
nodes first, where "local" is governed by the newly-introduced
"read_affinity" setting in the proxy config. This is desirable when
the network links between regions/zones are of varying capacities; in
such a case, it's a good idea to prefer fetching data from closer
backends.
The new setting looks like rN[zM]=P, where N is the region number, M
is the optional zone number, and P is the priority. Multiple values
can be specified by separating them with commas. The priority for
nodes that don't match anything is a very large number, so they'll
sort last.
This only affects the ordering of the primary nodes; it doesn't affect
handoffs at all. Further, while the primary nodes are reordered for
all requests, it only matters for GET/HEAD requests since handling the
other verbs ends up making concurrent requests to *all* the primary
nodes, so ordering is irrelevant.
Note that the default proxy config does not have this setting turned
on, so the default configuration's behavior is unaffected.
blueprint multi-region
Change-Id: Iea4cd367ed37fe5ee69b63234541d358d29963a4
2013年06月10日 16:51:47 -07:00
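The rN[zM]=P syntax from "Local read affinity for GET/HEAD requests" can be parsed and applied as in the sketch below. The function names and node dictionaries are hypothetical, and picking the lowest matching priority when several rules match is an assumption; the commit message only states that unmatched nodes get a very large number.

```python
import re
import sys

_RULE = re.compile(r"^r(\d+)(?:z(\d+))?=(\d+)$")


def parse_read_affinity(setting):
    """Turn 'r1z1=100, r1=200' into (region, zone, priority) tuples."""
    rules = []
    for item in setting.split(","):
        match = _RULE.match(item.strip())
        if not match:
            raise ValueError("bad read_affinity entry: %r" % item)
        region, zone, priority = match.groups()
        rules.append(
            (int(region), None if zone is None else int(zone), int(priority)))
    return rules


def affinity_key(rules):
    """Build a sort key; nodes matching no rule sort last."""
    def key(node):
        best = sys.maxsize
        for region, zone, priority in rules:
            if node["region"] == region and zone in (None, node["zone"]):
                best = min(best, priority)
        return best
    return key


rules = parse_read_affinity("r1z1=100, r1=200, r2=300")
nodes = [
    {"region": 3, "zone": 1},
    {"region": 2, "zone": 5},
    {"region": 1, "zone": 2},
    {"region": 1, "zone": 1},
]
ordered = sorted(nodes, key=affinity_key(rules))
```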
Jenkins
4077252f23 Merge "Make sample configs more readable." 2013年06月10日 14:24:49 +00:00
Greg Lange
209c5ec418 Add parallelism to object expirer daemon.
Two types of parallelism are added:
- concurrency to speed up what a single process does
- a way to run multiple daemons to work on different parts of the work
DocImpact
Change-Id: I48997f68eb2fd8de19a5ee8b9fcdf76dde2ba0ab
2013年06月07日 20:49:47 +00:00
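The two kinds of parallelism in "Add parallelism to object expirer daemon" could look roughly like this. The commit message does not say how work is divided between daemons, so splitting by a hash of the object name is an assumption, and all names below are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def belongs_to(name, process, processes):
    """True if daemon number `process` (0-based) should handle `name`."""
    if processes <= 1:
        return True
    digest = hashlib.sha256(name.encode()).hexdigest()
    return int(digest, 16) % processes == process


def expire_my_share(names, process, processes, concurrency, delete):
    """Delete this daemon's share of `names` using `concurrency` threads."""
    mine = [n for n in names if belongs_to(n, process, processes)]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(delete, mine))
    return mine
```

Each name maps to exactly one daemon, so several daemons can run against the same queue without coordinating with each other.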
Samuel Merritt
b491549ac2 Use threadpools in the object server for performance.
Without a (per-disk) threadpool, requests to a slow disk would affect
all clients by blocking the entire eventlet reactor on
read/write/etc. The slower the disk, the worse the performance. On an
object server, you frequently have at least one slow disk due to
auditing and replication activity sucking up all the available IO. By
kicking those blocking calls out to a separate OS thread, we let the
eventlet reactor make progress in other greenthreads, and by having a
per-disk pool, we ensure that one slow disk can't suck up all the
resources of an entire object server.
There were a few blocking calls that were done with eventlet.tpool,
but that's a fixed-size global threadpool, so I moved them to the
per-disk threadpools. If the object server is configured not to use
per-disk threadpools, (i.e. threads_per_disk = 0, which is the
default), those call sites will still ultimately end up using
eventlet.tpool.execute. You won't end up blocking a whole object
server while waiting for a huge fsync.
If you decide not to use threadpools, the only extra overhead should
be a few extra Python function calls here and there. This is
accomplished by setting threads_per_disk = 0 in the config.
blueprint concurrent-disk-io
Change-Id: I490f8753d926fdcee3a0c65c5aaf715bc2b7c290
2013年06月07日 13:06:04 -07:00
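The per-disk pools from "Use threadpools in the object server for performance" can be illustrated with the standard library. This sketch uses `concurrent.futures` instead of eventlet, and with `threads_per_disk = 0` it simply calls the function inline rather than falling back to `eventlet.tpool.execute`; the class and method names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DiskThreadPools:
    def __init__(self, threads_per_disk=0):
        self.threads_per_disk = threads_per_disk
        self._pools = {}
        self._guard = threading.Lock()

    def run_in_thread(self, disk, func, *args, **kwargs):
        """Run a blocking call on the pool that belongs to `disk`."""
        if self.threads_per_disk <= 0:
            # Pools disabled: call directly (simplification, see lead-in).
            return func(*args, **kwargs)
        with self._guard:
            pool = self._pools.get(disk)
            if pool is None:
                pool = ThreadPoolExecutor(max_workers=self.threads_per_disk)
                self._pools[disk] = pool
        # A slow disk can only exhaust the threads of its own pool.
        return pool.submit(func, *args, **kwargs).result()


pools = DiskThreadPools(threads_per_disk=2)
worker_id = pools.run_in_thread("sda", threading.get_ident)
```

In this plain-threads version the caller blocks on `result()`; under eventlet the waiting greenthread would yield so that other requests keep making progress.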
Pete Zaitcev
4b5db1dd0a Improve config samples
- Add proxy-logging to multinode. We had it since Folsom and people
 still forget it, resulting in missing logs.
- Use the correct name, so it is at least easy to hit with '*' in vi.
Admittedly trivial changes, which I meant to hold until Leah's major
doc improvement lands, but I'm tired of keeping stuff like this in
my working repo.
Change-Id: I44f80c51d6d7329a9b696e67fcb8a895db63e497
2013年06月06日 19:41:13 -06:00
Samuel Merritt
efdb0e3681 Make sample configs more readable.
Inject some empty lines to avoid the wall-of-text effect and to make
it a little clearer which descriptions go with which options.
Change-Id: I58914b83dad76ea5ca330903a246bee7ffaeba83
2013年06月06日 15:35:19 -07:00
Dieter Plaetinck
442fd83a8b implement an rsync_bwlimit setting for object replicator
Change-Id: I8789d6e4d22de83db9a2760d51a94eb56a48c3b5
2013年05月31日 15:57:19 -04:00
Donagh McCabe
34e2ab3f31 account-reaper warns if not making progress
DocImpact
If account reaper has not managed to clean out an account after a long
period, it prints a message to the log (you can search your system looking
for such messages). Introduce reap_warn_after config variable to determine
when to emit the message (defaults to 30 days).
Also fix bug 1181995 (edge case where object name is an empty string)
Change-Id: Ic0dfee04742d06b6a51b59f302d7a272d7c1de92
2013年05月22日 15:07:17 +01:00
Jenkins
959f5e7ea8 Merge "Implementation of replication servers" 2013年05月16日 02:43:49 +00:00
Jenkins
50157243dd Merge "Refactor Bulk middleware to handle long running requests" 2013年05月15日 23:14:15 +00:00
David Goetz
af2607c457 Refactor Bulk middleware to handle long running requests
Change-Id: I8ea0ff86518d453597faae44ec3918298e2d5147
2013年05月08日 10:00:21 -07:00
Clay Gerrard
34f5085c3e conf.d support
Allow Swift daemons and servers to optionally accept a directory as the
configuration parameter. Directory based configuration leverages
ConfigParser's native multi-file support. Files ending in '.conf' in the
given directory are parsed in lexicographical order. Filenames starting with
'.' are ignored. A mixture of file and directory configuration paths is not
supported - if the configuration path is a file, behavior is unchanged.
 * update swift-init to search for conf.d paths when building servers
 (e.g. /etc/swift/proxy-server.conf.d/)
 * new script swift-config can be used to inspect the cumulative configuration
 * pull a little bit of code out of run_wsgi and test separately
 * fix example config bug for the proxy server's client_disconnect option
 * added section on directory based configuration to deployment guide
DocImpact
Implements: blueprint confd
Change-Id: I89b0f48e538117f28590cf6698401f74ef58003b
2013年04月30日 00:17:46 -07:00
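The directory-based loading described in "conf.d support" can be sketched with Python 3's `configparser`. The function name is hypothetical and this is not Swift's own loader; it only follows the rules stated in the commit message: files ending in '.conf', lexicographical order, and names starting with '.' ignored.

```python
import configparser
import os


def read_conf_dir(path):
    """Parse every visible *.conf file in `path`, in sorted order."""
    names = sorted(
        name for name in os.listdir(path)
        if name.endswith(".conf") and not name.startswith(".")
    )
    parser = configparser.ConfigParser(interpolation=None)
    # Files are read in list order, so later files override earlier ones.
    parser.read([os.path.join(path, name) for name in names])
    return parser
```

Numeric prefixes such as `00_base.conf` and `10_override.conf` are a convenient way to make the override order explicit.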