
1 .\" $NetBSD: raid.4,v 1.41 2021年10月04日 14:35:20 andvar Exp $
2 .\"
3 .\" Copyright (c) 1998 The NetBSD Foundation, Inc.
4 .\" All rights reserved.
5 .\"
6 .\" This code is derived from software contributed to The NetBSD Foundation
7 .\" by Greg Oster
8 .\"
9 .\" Redistribution and use in source and binary forms, with or without
10 .\" modification, are permitted provided that the following conditions
11 .\" are met:
12 .\" 1. Redistributions of source code must retain the above copyright
13 .\" notice, this list of conditions and the following disclaimer.
14 .\" 2. Redistributions in binary form must reproduce the above copyright
15 .\" notice, this list of conditions and the following disclaimer in the
16 .\" documentation and/or other materials provided with the distribution.
17 .\"
18 .\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
19 .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
20 .\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
21 .\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
22 .\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
23 .\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
24 .\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
25 .\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
26 .\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
27 .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
28 .\" POSSIBILITY OF SUCH DAMAGE.
29 .\"
30 .\"
31 .\" Copyright (c) 1995 Carnegie-Mellon University.
32 .\" All rights reserved.
33 .\"
34 .\" Author: Mark Holland
35 .\"
36 .\" Permission to use, copy, modify and distribute this software and
37 .\" its documentation is hereby granted, provided that both the copyright
38 .\" notice and this permission notice appear in all copies of the
39 .\" software, derivative works or modified versions, and any portions
40 .\" thereof, and that both notices appear in supporting documentation.
41 .\"
42 .\" CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
43 .\" CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
44 .\" FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
45 .\"
46 .\" Carnegie Mellon requests users of this software to return to
47 .\"
48 .\" Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU
49 .\" School of Computer Science
50 .\" Carnegie Mellon University
51 .\" Pittsburgh PA 15213-3890
52 .\"
53 .\" any improvements or extensions that they make and grant Carnegie the
54 .\" rights to redistribute these changes.
55 .\"
.Dd May 26, 2021
.Dt RAID 4
.Os
.Sh NAME
.Nm raid
.Nd RAIDframe disk driver
.Sh SYNOPSIS
.Cd options RAID_AUTOCONFIG
.Cd options RAID_DIAGNOSTIC
.Cd options RF_ACC_TRACE=n
.Cd options RF_DEBUG_MAP=n
.Cd options RF_DEBUG_PSS=n
.Cd options RF_DEBUG_QUEUE=n
.Cd options RF_DEBUG_QUIESCE=n
.Cd options RF_DEBUG_RECON=n
.Cd options RF_DEBUG_STRIPELOCK=n
.Cd options RF_DEBUG_VALIDATE_DAG=n
.Cd options RF_DEBUG_VERIFYPARITY=n
.Cd options RF_INCLUDE_CHAINDECLUSTER=n
.Cd options RF_INCLUDE_EVENODD=n
.Cd options RF_INCLUDE_INTERDECLUSTER=n
.Cd options RF_INCLUDE_PARITY_DECLUSTERING=n
.Cd options RF_INCLUDE_PARITY_DECLUSTERING_DS=n
.Cd options RF_INCLUDE_PARITYLOGGING=n
.Cd options RF_INCLUDE_RAID5_RS=n
.Pp
.Cd pseudo-device raid
.Sh DESCRIPTION
The
.Nm
driver provides RAID 0, 1, 4, and 5 (and more!) capabilities to
.Nx .
This
document assumes that the reader has at least some familiarity with RAID
and RAID concepts.
The reader is also assumed to know how to configure
disks and pseudo-devices into kernels, how to generate kernels, and how
to partition disks.
.Pp
RAIDframe provides a number of different RAID levels including:
.Bl -tag -width indent
.It RAID 0
provides simple data striping across the components.
.It RAID 1
provides mirroring.
.It RAID 4
provides data striping across the components, with parity
stored on a dedicated drive (in this case, the last component).
.It RAID 5
provides data striping across the components, with parity
distributed across all the components.
.El
.Pp
There are a wide variety of other RAID levels supported by RAIDframe.
The configuration file options to enable them are briefly outlined
at the end of this section.
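.Pp
Individual RAID sets are described to
.Xr raidctl 8
through a small configuration file.
As a purely illustrative sketch (the device names, stripe size, and
queue parameters here are examples only; see
.Xr raidctl 8
for the authoritative format), a three-component RAID 5 set might be
described as:
.Bd -unfilled -offset indent
START array
# numRow numCol numSpare
1 3 0

START disks
/dev/sd0e
/dev/sd1e
/dev/sd2e

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
32 1 1 5

START queue
fifo 100
.Ed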
.Pp
Depending on the parity level configured, the device driver can
support the failure of component drives.
The number of failures
allowed depends on the parity level selected.
If the driver is able
to handle drive failures, and a drive does fail, then the system is
operating in "degraded mode".
In this mode, all missing data must be
reconstructed from the data and parity present on the other
components.
This results in much slower data accesses, but
does mean that a failure need not bring the system to a complete halt.
.Pp
The RAID driver supports and enforces the use of
.Sq component labels .
A
.Sq component label
contains important information about the component, including a
user-specified serial number, the row and column of that component in
the RAID set, and whether the data (and parity) on the component is
.Sq clean .
The component label currently lives at the half-way point of the
.Sq reserved section
located at the beginning of each component.
This
.Sq reserved section
is RF_PROTECTED_SECTORS in length (64 blocks or 32 Kbytes) and the
component label is currently 1 Kbyte in size.
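.Pp
For illustration only, assuming the traditional 512-byte sector size,
this works out to:
.Bd -unfilled -offset indent
reserved section: 64 sectors x 512 bytes = 32 Kbytes
label offset:     half-way point = sector 32 (16 Kbytes into the component)
label size:       1 Kbyte = 2 sectors
.Ed
.Pp
The exact on-disk layout is private to the driver and may change;
these figures merely show the scale of the space reserved.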
.Pp
If the driver determines that the component labels are very inconsistent with
respect to each other (e.g. two or more serial numbers do not match)
or that the component label is not consistent with its assigned place
in the set (e.g. the component label claims the component should be
the 3rd one in a 6-disk set, but the RAID set has it as the 3rd component
in a 5-disk set), then the device will fail to configure.
If the
driver determines that exactly one component label seems to be
incorrect, and the RAID set is being configured as a set that supports
a single failure, then the RAID set will be allowed to configure, but
the incorrectly labeled component will be marked as
.Sq failed ,
and the RAID set will begin operation in degraded mode.
If all of the components are consistent among themselves, the RAID set
will configure normally.
.Pp
Component labels are also used to support the auto-detection and
autoconfiguration of RAID sets.
A RAID set can be flagged as
autoconfigurable, in which case it will be configured automatically
during the kernel boot process.
RAID file systems which are
automatically configured are also eligible to be the root file system.
There is currently only limited support (alpha, amd64, i386, pmax,
sparc, sparc64, and vax architectures)
for booting a kernel directly from a RAID 1 set, and no support for
booting from any other RAID sets.
To use a RAID set as the root
file system, a kernel is usually obtained from a small non-RAID
partition, after which any autoconfiguring RAID set can be used for the
root file system.
See
.Xr raidctl 8
for more information on autoconfiguration of RAID sets.
Note that with autoconfiguration of RAID sets, it is no longer
necessary to hard-code SCSI IDs of drives.
The autoconfiguration code will
correctly configure a device even after any number of the components
have had their device IDs or device names changed.
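.Pp
For example, an existing set might be flagged as autoconfigurable, and
additionally marked as eligible to contain the root file system, with
commands along the lines of the following (the device name is an
example only; see
.Xr raidctl 8
for the exact flag semantics):
.Bd -unfilled -offset indent
raidctl -A yes raid0
raidctl -A root raid0
.Ed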
.Pp
The driver supports
.Sq hot spares ,
disks which are on-line, but are not
actively used in an existing file system.
Should a disk fail, the
driver is capable of reconstructing the failed disk onto a hot spare
or back onto a replacement drive.
If the components are hot swappable, the failed disk can then be
removed, a new disk put in its place, and a copyback operation
performed.
The copyback operation, as its name indicates, will copy
the reconstructed data from the hot spare to the previously failed
(and now replaced) disk.
Hot spares can also be hot-added using
.Xr raidctl 8 .
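.Pp
A typical sequence, with purely hypothetical device and component
names, might be to add a spare, reconstruct a failing component onto
it, and later copy the data back once the original disk has been
replaced:
.Bd -unfilled -offset indent
raidctl -a /dev/sd3e raid0	# add /dev/sd3e as a hot spare
raidctl -F /dev/sd1e raid0	# fail the component and rebuild to the spare
raidctl -B raid0		# copy the data back to the replaced disk
.Ed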
.Pp
If a component cannot be detected when the RAID device is configured,
that component will simply be marked as
.Sq failed .
.Pp
The user-land utility for doing all
.Nm
configuration and other operations
is
.Xr raidctl 8 .
Most importantly,
.Xr raidctl 8
must be used with the
.Fl i
option to initialize all RAID sets.
In particular, this
initialization includes re-building the parity data.
This rebuilding
of parity data is also required either a) when a new RAID device is
brought up for the first time, or b) after an unclean shutdown of a
RAID device.
By using the
.Fl P
option to
.Xr raidctl 8 ,
and performing this on-demand recomputation of all parity
before doing a
.Xr fsck 8
or a
.Xr newfs 8 ,
file system integrity and parity integrity can be ensured.
It bears repeating that parity recomputation is
.Ar required
before any file systems are created or used on the RAID device.
If the
parity is not correct, then missing data cannot be correctly recovered.
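.Pp
As a minimal sketch (device and partition names are examples only),
the two common cases look like this:
.Bd -unfilled -offset indent
# when the set is first created:
raidctl -i raid0
newfs /dev/rraid0a

# after an unclean shutdown, before fsck or mount:
raidctl -P raid0
fsck /dev/rraid0a
.Ed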
.Pp
RAID levels may be combined in a hierarchical fashion.
For example, a RAID 0
device can be constructed out of a number of RAID 5 devices (which, in turn,
may be constructed out of the physical disks, or of other RAID devices).
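.Pp
In such a stacked configuration, the components of the outer set are
simply partitions on the inner
.Nm
devices.
The disks section of a hypothetical RAID 0 set built from two existing
RAID 5 sets might therefore read:
.Bd -unfilled -offset indent
START disks
/dev/raid1e
/dev/raid2e
.Ed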
.Pp
The first step to using the
.Nm
driver is to ensure that it is suitably configured in the kernel.
This is done by adding a line similar to:
.Bd -unfilled -offset indent
pseudo-device raid # RAIDframe disk device
.Ed
.Pp
to the kernel configuration file.
The RAIDframe drivers are configured dynamically as needed.
To turn on component auto-detection and autoconfiguration of RAID
sets, simply add:
.Bd -unfilled -offset indent
options RAID_AUTOCONFIG
.Ed
.Pp
to the kernel configuration file.
.Pp
All component partitions must be of the type
.Dv FS_BSDFFS
(e.g. 4.2BSD) or
.Dv FS_RAID .
The use of the latter is strongly encouraged, and is required if
autoconfiguration of the RAID set is desired.
Since RAIDframe leaves
room for disklabels, RAID components can be simply raw disks, or
partitions which use an entire disk.
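.Pp
In
.Xr disklabel 8
output, a component partition of the preferred type appears with the
fstype
.Sq RAID ,
as in this purely illustrative fragment (sizes and offsets are
examples only):
.Bd -unfilled -offset indent
#        size    offset     fstype [fsize bsize cpg/sgs]
 e:  16777216      2048       RAID
.Ed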
.Pp
A more detailed treatment of actually using a
.Nm
device is found in
.Xr raidctl 8 .
It is highly recommended that the steps to reconstruct, copyback, and
re-compute parity be well understood by the system administrator(s)
.Ar before
a component failure.
Doing the wrong thing when a component fails may
result in data loss.
.Pp
Additional internal consistency checking can be enabled by specifying:
.Bd -unfilled -offset indent
options RAID_DIAGNOSTIC
.Ed
.Pp
These assertions are disabled by default in order to improve
performance.
.Pp
RAIDframe supports an access tracing facility for tracking both the
requests made and the performance of various parts of the RAID system
as each request is processed.
To enable this tracing, the following option may be specified:
.Bd -unfilled -offset indent
options RF_ACC_TRACE=1
.Ed
.Pp
For extensive debugging, there are a number of kernel options which
will aid in performing extra diagnosis of various parts of the
RAIDframe sub-systems.
Note that in order to make full use of these options it is often
necessary to enable one or more debugging options as listed in
.Pa src/sys/dev/raidframe/rf_options.h .
These options are typically only useful to people who wish
to debug various parts of RAIDframe.
The options include:
.Pp
For debugging the code which maps RAID addresses to physical
addresses:
.Bd -unfilled -offset indent
options RF_DEBUG_MAP=1
.Ed
.Pp
Parity stripe status debugging is enabled with:
.Bd -unfilled -offset indent
options RF_DEBUG_PSS=1
.Ed
.Pp
Additional debugging for queuing is enabled with:
.Bd -unfilled -offset indent
options RF_DEBUG_QUEUE=1
.Ed
.Pp
Problems with non-quiescent file systems should be easier to debug if
the following is enabled:
.Bd -unfilled -offset indent
options RF_DEBUG_QUIESCE=1
.Ed
.Pp
Stripelock debugging is enabled with:
.Bd -unfilled -offset indent
options RF_DEBUG_STRIPELOCK=1
.Ed
.Pp
Additional diagnostic checks during reconstruction are enabled with:
.Bd -unfilled -offset indent
options RF_DEBUG_RECON=1
.Ed
.Pp
Validation of the DAGs (Directed Acyclic Graphs) used to describe an
I/O access can be performed when the following is enabled:
.Bd -unfilled -offset indent
options RF_DEBUG_VALIDATE_DAG=1
.Ed
.Pp
Additional diagnostics during parity verification are enabled with:
.Bd -unfilled -offset indent
options RF_DEBUG_VERIFYPARITY=1
.Ed
.Pp
There are a number of less commonly used RAID levels supported by
RAIDframe.
These additional RAID types should be considered experimental, and
may not be ready for production use.
The various types and the options to enable them are shown here:
.Pp
For Even-Odd parity:
.Bd -unfilled -offset indent
options RF_INCLUDE_EVENODD=1
.Ed
.Pp
For RAID level 5 with rotated sparing:
.Bd -unfilled -offset indent
options RF_INCLUDE_RAID5_RS=1
.Ed
.Pp
For Parity Logging (highly experimental):
.Bd -unfilled -offset indent
options RF_INCLUDE_PARITYLOGGING=1
.Ed
.Pp
For Chain Declustering:
.Bd -unfilled -offset indent
options RF_INCLUDE_CHAINDECLUSTER=1
.Ed
.Pp
For Interleaved Declustering:
.Bd -unfilled -offset indent
options RF_INCLUDE_INTERDECLUSTER=1
.Ed
.Pp
For Parity Declustering:
.Bd -unfilled -offset indent
options RF_INCLUDE_PARITY_DECLUSTERING=1
.Ed
.Pp
For Parity Declustering with Distributed Spares:
.Bd -unfilled -offset indent
options RF_INCLUDE_PARITY_DECLUSTERING_DS=1
.Ed
.Pp
The reader is referred to the RAIDframe documentation mentioned in the
.Sx HISTORY
section for more detail on these various RAID configurations.
.Sh WARNINGS
Certain RAID levels (1, 4, 5, 6, and others) can protect against some
data loss due to component failure.
However, the loss of two
components of a RAID 4 or 5 system, or the loss of a single component
of a RAID 0 system, will result in the loss of all file systems on that
RAID device.
RAID is
.Ar NOT
a substitute for good backup practices.
.Pp
Recomputation of parity
.Ar MUST
be performed whenever there is a chance that it may have been
compromised.
This includes after system crashes, or before a RAID
device has been used for the first time.
Failure to keep parity
correct will be catastrophic should a component ever fail \(em it is
better to use RAID 0 and get the additional space and speed, than it
is to use parity, but not keep the parity correct.
At least with RAID
0 there is no perception of increased data security.
.Sh FILES
.Bl -tag -width /dev/XXrXraidX -compact
.It Pa /dev/{,r}raid*
.Nm
device special files.
.El
.Sh SEE ALSO
.Xr config 1 ,
.Xr sd 4 ,
.Xr fsck 8 ,
.Xr MAKEDEV 8 ,
.Xr mount 8 ,
.Xr newfs 8 ,
.Xr raidctl 8
.Sh HISTORY
The
.Nm
driver in
.Nx
is a port of RAIDframe, a framework for rapid prototyping of RAID
structures developed by the folks at the Parallel Data Laboratory at
Carnegie Mellon University (CMU).
RAIDframe, as originally distributed
by CMU, provides a RAID simulator for a number of different
architectures, and a user-level device driver and a kernel device
driver for Digital Unix.
The
.Nm
driver is a kernelized version of RAIDframe v1.1.
.Pp
A more complete description of the internals and functionality of
RAIDframe is found in the paper "RAIDframe: A Rapid Prototyping Tool
for RAID Systems", by William V. Courtright II, Garth Gibson, Mark
Holland, LeAnn Neal Reilly, and Jim Zelenka, and published by the
Parallel Data Laboratory of Carnegie Mellon University.
The
.Nm
driver first appeared in
.Nx 1.4 .
.Pp
RAIDframe was ported to
.Nx
in 1998 by Greg Oster, who has maintained it since.
In 1999, component labels, spares, automatic rebuilding of parity, and
autoconfiguration of volumes were added.
In 2000, root on RAID support was added (initially, with no support for
loading kernels from RAID volumes, which has been added to many ports since).
In 2009, support for parity maps was added, reducing parity resync time
after a crash.
In 2010, support for devices larger than 2 TiB and for non-512-byte
sectors was added.
In 2018, support for 32-bit userland compatibility was added.
In 2021, support for autoconfiguration from other-endian RAID sets was added.
.Pp
Support for loading kernels from RAID 1 partitions was added for the
pmax, alpha, i386, and vax ports in 2000, the sgimips port in 2001,
the sparc64 and amd64 ports in 2002, the arc port in 2005, the sparc
and landisk ports in 2006, the cobalt port in 2007, the ofppc port in 2008,
the bebox port in 2010, the emips port in 2011, and the sandpoint port
in 2012.
.Sh COPYRIGHT
.Bd -unfilled
The RAIDframe Copyright is as follows:
.Pp
Copyright (c) 1994-1996 Carnegie-Mellon University.
All rights reserved.
.Pp
Permission to use, copy, modify and distribute this software and
its documentation is hereby granted, provided that both the copyright
notice and this permission notice appear in all copies of the
software, derivative works or modified versions, and any portions
thereof, and that both notices appear in supporting documentation.
.Pp
CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
CONDITION.  CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
.Pp
Carnegie Mellon requests users of this software to return to
.Pp
 Software Distribution Coordinator  or  Software.Distribution@CS.CMU.EDU
 School of Computer Science
 Carnegie Mellon University
 Pittsburgh PA 15213-3890
.Pp
any improvements or extensions that they make and grant Carnegie the
rights to redistribute these changes.
.Ed