2722e49a8c844e2d20369e8aed230a972eb07e58
1191 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
Christian Schwede
|
e1140666d6 |
Add support to increase object ring partition power
This patch adds methods to increase the partition power of an existing object ring without downtime for users, using a 3-step process. Data won't be moved to other nodes; objects using the new, increased partition power will be located on the same device and are hardlinked to avoid data movement.
1. A new setting "next_part_power" will be added to the rings, and once the proxy server has reloaded the rings it will send this value to the object servers on any write operation. Object servers will now create a hard link in the new location to the original DiskFile object. Already existing data will be relinked in the new locations with hard links, using a new tool.
2. The actual partition power itself will be increased. Servers will now use the new partition power to read from and write to. Hard links in the old object location that are no longer required have to be removed now by the relinker tool; the relinker tool reads the next_part_power setting to find object locations that need to be cleaned up.
3. The "next_part_power" flag will be removed.
This mostly implements the spec in [1]; however, it's not using an "epoch" as described there. The idea of the epoch was to store data using different partition powers in their own namespace to avoid conflicts with auditors and replicators, as well as being able to abort such an operation and just remove the new tree. This would require some heavy change of the on-disk data layout, and other object-server implementations would be required to adopt this scheme too.
Instead, the object-replicator is now aware that there is a partition power increase in progress and will skip replication of data in that storage policy; the relinker tool should simply be run, and afterwards the partition power is increased. This shouldn't take much time (it only walks the filesystem and creates hard links), so the impact should be low.
The relinker should be run on all storage nodes in parallel to decrease the required time (though this is not mandatory). Failures during relinking should not affect cluster operations; relinking can even be aborted manually and restarted later.
Auditors do not quarantine objects written to a path with a different partition power and therefore work as before (though in the worst case they read each object twice before the no longer needed hard links are removed).
Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
[1] https://specs.openstack.org/openstack/swift-specs/specs/in_progress/increasing_partition_power.html
Change-Id: I7d6371a04f5c1c4adbb8733a71f3c177ee5448bb |
||
|
Jenkins
|
73da215bdf | Merge "Ring doc cleanups" | ||
|
Jenkins
|
4315093a28 | Merge "More Global EC doc updates" | ||
|
Tim Burke
|
b5ee8c88d0 |
Ring doc cleanups
Change-Id: Ie51ea5c729341da793887e1e25c1e45301a96751 |
||
|
Clay Gerrard
|
4c7839d256 |
More Global EC doc updates
Soften the language about inefficiency on read and strengthen the language encouraging the use of read affinity and composite rings. Change-Id: Idc81a8c71e74ae28d384759700c5268d77ae3c85 |
||
|
Jenkins
|
41c8f1330f | Merge "Update Global EC docs with reference to composite rings" | ||
|
Alistair Coles
|
9665252352 |
Update Global EC docs with reference to composite rings
* In light of the composite rings feature being added [1], downgrade the warnings about EC Duplication [2] being experimental.
* Add links from Global EC docs to composite rings and per-policy proxy config features.
* Add discussion of using EC duplication with composite rings.
* Update Known Issues.
[1] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
Change-Id: Id97a4899255945a6eaeacfef12fd29a2580588df |
||
|
Jenkins
|
7bbe02b290 | Merge "Allow to configure the nameservers in cname_lookup" | ||
|
Tim Burke
|
d51ecb4ecc |
Remove threads_per_disk from object-server.conf manpages
That option was removed entirely in 2.8.0. Change-Id: Ib40f816936429a78e622d3737bb0b064225d2d44 Related-Change: Ie76be5c8a74d60a1330627caace19e06d1b9383c |
||
|
Romain LE DISEZ
|
420e73fabd |
Allow to configure the nameservers in cname_lookup
For various reasons, an operator might want to use specific nameservers instead of the system ones to resolve CNAMEs in cname_lookup. This patch creates a new configuration variable, nameservers, which accepts a comma-separated list of nameservers. If it is not specified or is empty, the system nameservers are used as before. Co-Authored-By: Tim Burke <tim.burke@gmail.com> Change-Id: I34219e6ab7e45678c1a80ff76a1ac0730c64ddde |
||
|
Jenkins
|
db45e1dd69 | Merge "Add structure to storage policy configuration guide" | ||
|
Alistair Coles
|
37ba21face |
Add structure to storage policy configuration guide
The description of storage policy config options was unstructured and repetitive. This patch attempts to improve the doc by gathering the notes for each option into a structured list. Change-Id: I57090b35a70f365e82fb0e29ab42e533d6359a7b |
||
|
Jenkins
|
b9322a2f08 | Merge "Add link from policies overview to per-policy proxy-server conf" | ||
|
Alistair Coles
|
227cef9933 |
Add link from policies overview to per-policy proxy-server conf
- add proxy server per policy config as an optional step in the configuration of a policy, with link to the deployment guide
- add reverse link from deployment guide per-policy config doc section to storage policies docs
Drive-by fix an incorrect test comment
Change-Id: Ib95310193270a63c9d1e321c6e7de240e00b387f Related-Change: I3f718f425f525baa80045ba067950c752bcaaefc |
||
|
Tim Burke
|
d487bf7fb1 |
Remove tempauth docs from deployment guide
Instead, link to the middleware list and auth overview, as well as referring readers to proxy-server.conf-sample. TempAuth-related content that was previously in the deployment guide has been moved to TempAuth's own docs, which have been cleaned up a bit. Change-Id: I00070bb09294362c069f7ee9426ac570bc1b3ddb |
||
|
Jenkins
|
263dc8a3f3 | Merge "Enable per policy proxy config options" | ||
|
Alistair Coles
|
45884c1102 |
Enable per policy proxy config options
This is an alternative approach to that proposed in [1].
Adds support for optional per-policy config sections to be added in proxy-server.conf. This is highly desirable to allow per-policy affinity options to be set for use with duplicated EC policies [2] and composite rings [3].
Certain options found in per-policy conf sections will override their equivalents that may be set in the [app:proxy-server] section. Currently the options handled that way are:
sorting_method
read_affinity
write_affinity
write_affinity_node_count
For example:
[proxy-server:policy:0]
sorting_method = affinity
read_affinity = r1=100
write_affinity = r1
write_affinity_node_count = 1 * replicas
The corresponding attributes of the proxy-server Application are now available from instances of an OverrideConf object that is obtained from Application.get_policy_options(policy).
[1] Related-Change: I9104fc789ba85ab3ab5ccd34096125b482821389
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
[3] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I3f718f425f525baa80045ba067950c752bcaaefc |
||
|
Jenkins
|
a2e020c52b | Merge "Add read and write affinity options to deployment guide" | ||
|
Alistair Coles
|
f02ec4de81 |
Add read and write affinity options to deployment guide
Add entries for these options in the deployment guide and make the text in proxy-server.conf-sample and man page consistent. Change-Id: I5854ddb3e5864ddbeaf9ac2c930bfafdb47517c3 |
||
|
Jenkins
|
9089e44c0b | Merge "Add Composite Ring Functionality" | ||
|
Kota Tsuyuzaki
|
d40031b46f |
Add Composite Ring Functionality
* Adds a composite_builder module which provides the functionality to build a composite ring from a number of component ring builders.
* Add id to RingBuilder to differentiate rings in composite. A RingBuilder now gets a UUID when it is saved to file if it does not already have one. A RingBuilder loaded from file does NOT get a UUID assigned unless it was previously persisted in the file. This forces users to explicitly assign an id to existing ring builders by saving the state back to file. The UUID is included in the first line of the output from: swift-ring-builder <builder-file>
Background: This is another implementation of Composite Ring [1] to enable better dispersion for global erasure-coded clusters. The most significant difference from the related change [1] is that this solution attempts to solve the problem as an offline tool rather than by dynamic compositing on the running servers. Due to the change, we gain advantages such as:
- Less and simpler code
- No complex state validation on the running server
- Easy deployments with an offline tool
This patch does not provide a command line utility for managing composite rings. The interface for such a tool is still under discussion; this patch provides the enabling functionality first.
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
[1] Related-Change: I80ef36d3ac4d4b7c97a1d034b7fc8e0dc2214d16
Change-Id: I0d8928b55020592f8e75321d1f7678688301d797 |
||
|
Tim Burke
|
3981e1ee8a |
Remove links for EOLed releases
Change-Id: If7e8526edf18b02474ba272451b9b4212558e03c |
||
|
Clay Gerrard
|
37f6f25283 |
Update multi-node install links
... as is useful to do from time to time Change-Id: I165899445080fa3a8e6dc624ab5a13680b819a73 |
||
|
Ngo Quoc Cuong
|
e23f8d3160 |
Trivial fix typo while reading doc
Change-Id: I9d96dd4464a086e508fbf18057b4f3e90c82c916 |
||
|
Tim Burke
|
cce719260d |
Clean up some doc formatting
Change-Id: Iac24369910464cb766fe7d5e6c15120d147930a7 |
||
|
lijunbo
|
47ba1041fc |
Use swift tempurl instaed of swift-temp-url
Deprecate swift-temp-url and call python-swiftclient's implementation instead. This adds python-swiftclient as an optional dependency of Swift which is noted in releasenotes. Change-Id: I0404f16c21099cb7695430f5b63722729c305613 |
||
|
Jenkins
|
91fc844e7b | Merge "Document SAIO rsync service setup for ubuntu 16" | ||
|
lijunbo
|
21396bc106 |
keep consistent naming convention of swift and urls
Change-Id: Iddd4f69abf77a5c643ce8b164fc6cfd72c068229 |
||
|
Jenkins
|
b43414c905 | Merge "Accept storage_domain as a list in domain_remap" | ||
|
Jenkins
|
1e9b8888bf | Merge "Enable cluster-wide CORS Expose-Headers setting" | ||
|
Jenkins
|
a3efca5027 | Merge "Support EC policy for in process functional tests" | ||
|
Alistair Coles
|
5f610c76bd |
Support EC policy for in process functional tests
Add support for a 2+1 EC policy to be optionally used as default policy when running in process functional tests. The EC policy may be selected by setting the env var: SWIFT_TEST_IN_PROCESS_CONF_LOADER=ec tox when running .functests, or by using the new tox test env: tox -e func-ec Change-Id: I02e3553a74a024efdab91dcd609ac1cf4e4f3208 |
||
|
Monty Taylor
|
3c844d02b9 |
Replace references to swift.openstack.org
The policy of giving projects vanity domains stopped about 5 years ago. swift.openstack.org is a redirect to the canonical location - docs.openstack.org/developer/swift. While we are not aiming to remove the redirect any time in the foreseeable future due to existing published links pointing to it, we should at the very least stop adding more of those links to the world. Change-Id: I10e92309f5d3b5f908fe4438f5cc0b184f161cba |
||
|
Alistair Coles
|
adcb4c270e |
Document SAIO rsync service setup for ubuntu 16
SAIO docs do suggest using Ubuntu 14.04, but if using 16.04 then systemctl needs to be used to have rsync service restart on reboot. Change-Id: I4fb0d3d063df61fbdfca981f06911148f3c4dc04 |
||
|
Clay Gerrard
|
38b99ad195 |
Global EC Under Development Documentation
Lay out the foundation for documenting the features which will enable Global EC. The formatting of the sections in our existing EC docs didn't follow best practices [1] and it caused some sphinx build warnings. 1. http://www.sphinx-doc.org/en/stable/rest.html#sections Change-Id: I2d164dafeb84629c75c3c2ff774329ee84270b7f |
||
|
Tim Burke
|
2ca303597e |
Make Sphinx treat warnings as errors
...and fix up the one warning that's crept in. Change-Id: I3985d027f0ac2119ceaeb4daba5964f937de6cea |
||
|
Jenkins
|
1f36b5dd16 | Merge "EC Fragment Duplication - Foundational Global EC Cluster Support" | ||
|
Romain LE DISEZ
|
9b47de3095 |
Enable cluster-wide CORS Expose-Headers setting
An operator offering a web UX to its customers might want to allow web browsers to access some headers by default (e.g. X-Storage-Policy, X-Container-Read, ...). This commit adds a new setting to the proxy-server to allow some headers to be added cluster-wide to the CORS header Access-Control-Expose-Headers. Change-Id: I5ca90a052f27c98a514a96ee2299bfa1b6d46334 |
||
|
Kota Tsuyuzaki
|
40ba7f6172 |
EC Fragment Duplication - Foundational Global EC Cluster Support
This patch enables efficient PUT/GET for globally distributed clusters [1].
Problem: Erasure coding can reduce the amount of data actually stored compared with the replicated model. For example, with ec_k=6, ec_m=3 the stored data is 1.5x the original, which is smaller than with 3x replication. However, unlike replication, erasure coding requires availability of at least some ec_k fragments of the total ec_k + ec_m fragments to service a read (e.g. 6 of 9 in the case above). As such, if we stored the EC object in a Swift cluster spanning 2 geographically distributed data centers which have the same volume of disks, it is likely the fragments will be stored evenly (about 4 and 5), so we still need to access a faraway data center to decode the original object. In addition, if one of the data centers was lost in a disaster, the stored objects will be lost forever, and we have to cry a lot. To ensure highly durable storage, you would think of making *more* parity fragments (e.g. ec_k=6, ec_m=10); unfortunately this causes *significant* performance degradation due to the cost of the mathematical calculation for erasure coding encode/decode.
How this resolves the problem: EC Fragment Duplication extends on the initial solution to add *more* fragments from which to rebuild an object, similar to the solution described above. The difference is making *copies* of encoded fragments. With experimental results [1][2], employing small ec_k and ec_m shows enough performance to store/retrieve objects.
On PUT:
- Encode incoming object with small ec_k and ec_m <- faster!
- Make duplicated copies of the encoded fragments. The number of copies is determined by 'ec_duplication_factor' in swift.conf
- Store all fragments in Swift Global EC Cluster
The duplicated fragments increase pressure on existing requirements when decoding objects in service to a read request. All fragments are stored with their X-Object-Sysmeta-Ec-Frag-Index.
In this change, the X-Object-Sysmeta-Ec-Frag-Index represents the actual fragment index encoded by PyECLib; there *will* be duplicates. Any time we must decode the original object data, we must only consider the ec_k fragments as unique according to their X-Object-Sysmeta-Ec-Frag-Index. On decode, no duplicate X-Object-Sysmeta-Ec-Frag-Index may be used when decoding an object; duplicate X-Object-Sysmeta-Ec-Frag-Index should be expected and avoided if possible.
On GET: This patch includes the following changes:
- Change GET Path to sort primary nodes grouping as subsets, so that each subset includes unique fragments
- Change Reconstructor to be more aware of possibly duplicate fragments
For example, with this change, a policy could be configured such that
swift.conf:
ec_num_data_fragments = 2
ec_num_parity_fragments = 1
ec_duplication_factor = 2
(object ring must have 6 replicas)
At Object-Server:
node index (from object ring): 0 1 2 3 4 5 <- keep node index for reconstruct decision
X-Object-Sysmeta-Ec-Frag-Index: 0 1 2 0 1 2 <- each object keeps actual fragment index for backend (PyECLib)
Additional improvements to Global EC Cluster Support will require features such as Composite Rings, and more efficient fragment rebalance/reconstruction.
1: http://goo.gl/IYiNPk (Swift Design Spec Repository)
2: http://goo.gl/frgj6w (Slide Share for OpenStack Summit Tokyo)
Doc-Impact
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Idd155401982a2c48110c30b480966a863f6bd305 |
||
|
Alistair Coles
|
9be1d8ba28 |
Fix tox -e docs sphinx errors
Change-Id: I6e200558b75ac539b59b492d13c36702443efc89 |
||
|
Romain LE DISEZ
|
5c93d6f238 |
Accept storage_domain as a list in domain_remap
The domain_remap middleware can work with the cname_lookup middleware. The latter accepts a list of domains for storage_domain. To be consistent, domain_remap should have the same behavior. Closes-Bug: #1664647 Change-Id: Iacc6619968cc7c677bf63e0b8d101a20c86ce599 |
||
|
Anh Tran
|
1c4a16a3f9 |
Typo fix: curent => current
Change-Id: Ib7d2c16a755ae1faca5a371d5dae1e143110178f |
||
|
Nam Nguyen Hoai
|
44fc037f97 |
Fix_typo: "subsitute" -> "substitute"
There is a wrong word, it should be updated. Change-Id: I17a1fed844bcd8ab3ed668a34a8471e5f6032963 |
||
|
Tim Burke
|
13f1fc0885 |
Clean up EC overview docs a bit
Change-Id: I3bab2c015c63f32dcd6e4beefbcd0fcf22e91eec |
||
|
Alistair Coles
|
b5530f4620 |
Bring docs inline with changes to tox envs
Make development guidelines consistent with recent changes to tox envs, specifically the removal of "-in-process" from some test env names [1] and the removal of the "func-fast-post" test env [2]. [1] Related-Change: I02477d81b836df71780942189d37d616944c4dce [2] Related-Change: I6faf8fcfa0a1d96aaf0f5e0ad2106b2b416da22f Change-Id: I08b92c005ee50beff09a92b4331dd7dbeed79bde |
||
|
Jenkins
|
2914e04493 | Merge "ISO 8601 timestamps for tempurl" | ||
|
Jenkins
|
63b351893d | Merge "Default object_post_as_copy to False" | ||
|
Christopher Bartz
|
51727c531a |
ISO 8601 timestamps for tempurl
With this commit, the tempurl middleware accepts (besides the traditional unix timestamps) also timestamps according to the format '%Y-%m-%dT%H:%M:%SZ' (one acceptable form of ISO 8601). The idea is to make the tempurls more user-friendly, and has been formulated here: Change-Id: I346a0241060a9559d178b30e60c957792bbeb9f0 Implements: blueprint human-readable-tempurl-timestamp |
||
|
Thiago da Silva
|
1b7aabd75f |
remove reference to deprecated tool
Let's remove the reference to swift-temp-url since it has been deprecated and add a link to the swift client. Change-Id: I70d64bf90f23a0f48b238ae6a99ab86f87d028a1 Signed-off-by: Thiago da Silva <thiago@redhat.com> |
||
|
Tim Burke
|
4ee20dba48 |
Default object_post_as_copy to False
Additionally, emit deprecation warnings when running POST-as-COPY Change-Id: I11324e711057f7332577fd38f9bff82bdc6aac90 |
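
As an illustration of why the partition power increase in commit e1140666d6 (top of this log) can keep objects on the same device and only hardlink them, here is a minimal sketch. It assumes a partition is taken from the most significant bits of a hash of the object path, so raising the power by one maps an old partition p to either 2p or 2p+1. The helper names `partition_for` and `check_relink_property`, and the plain unsalted MD5 of the path, are hypothetical simplifications for this example and are not Swift's actual code.

```python
import hashlib
import struct


def partition_for(path, part_power):
    """Hypothetical helper: use the top `part_power` bits of the first
    four bytes of an MD5 digest of the path as the partition number."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return struct.unpack_from(">I", digest)[0] >> (32 - part_power)


def check_relink_property(paths, part_power):
    """Return True if, for every path, the partition at part_power + 1
    is 2p or 2p + 1, where p is the partition at part_power."""
    for path in paths:
        old = partition_for(path, part_power)
        new = partition_for(path, part_power + 1)
        if new not in (2 * old, 2 * old + 1):
            return False
    return True
```

Under this assumption, every new partition has exactly one parent partition, which is what would let a relinking tool create the new location with a hard link on the same filesystem instead of copying data to another node.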