I have written these descriptions because I was unable to find them elsewhere on the
Internet, and I assume that they may be useful to someone. I give no guarantees. If you use
these descriptions, you also take full responsibility for all consequences.
IMO Number
IMO = International Maritime Organization. An IMO number identifies a commercial
passenger or cargo ship. A number consists of the letters IMO followed by seven
decimal digits. The numbers are assigned by IHS Fairplay, formerly Lloyd's Register -
Fairplay Ltd. The number is clearly marked on the side or stern of a cargo ship
and on the top of a passenger ship, and it is also used in the ship's documents.
The number remains the same throughout the ship's lifetime, regardless of
changes in the ship's name, structure or ownership. Once assigned, a number is
never reused for another ship. The last digit is a check digit, calculated as follows:
- The six digits to be checked (all but the last) are multiplied by the weights
2, 3, 4, 5, 6 and 7, assigned from right to left.
- The products are added up.
- The sum is divided by 10. The remainder (the last digit of the sum)
is the check digit.
Example: IMO 7625811 (Kristina Katarina, Kotka, Finland)
Digit      7   6   2   5   8   1   1
Weight     7   6   5   4   3   2
Product   49  36  10  20  24   2   = 141 → 1
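The steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not an official implementation: the function names imo_check_digit and is_valid_imo are hypothetical, and the number is assumed to be given as a string of digits without the letters IMO.

```python
def imo_check_digit(first_six: str) -> int:
    """Return the check digit for the first six digits of an IMO number."""
    if len(first_six) != 6 or not (first_six.isascii() and first_six.isdigit()):
        raise ValueError("expected exactly six decimal digits")
    # Weights 2..7 are assigned from right to left, i.e. 7..2 from left to right.
    weights = (7, 6, 5, 4, 3, 2)
    total = sum(int(d) * w for d, w in zip(first_six, weights))
    # The remainder modulo 10 is the last digit of the sum.
    return total % 10


def is_valid_imo(number: str) -> bool:
    """Check a seven-digit IMO number given without the 'IMO' prefix."""
    if len(number) != 7 or not (number.isascii() and number.isdigit()):
        return False
    return imo_check_digit(number[:6]) == int(number[6])


print(imo_check_digit("762581"))  # the sum is 141, so the check digit is 1
print(is_valid_imo("7625811"))    # Kristina Katarina
```

Malformed input is rejected rather than guessed at, because a check digit is only meaningful for a complete seven-digit number.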
The method could also be described by saying that the weighting factors are 3..8
from left to right, and the check digit is the digit that you need to add to the
sum to make it evenly divisible by 10.
Digit      7   6   2   5   8   1   1
Weight     3   4   5   6   7   8
Product   21  24  10  30  56   8   = 149 → 1 (149 + 1 = 150)
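The two descriptions agree because the two weights used for the same digit position always add up to 10. A short derivation, where d_1 ... d_6 denote the first six digits from left to right and S_1, S_2 the two weighted sums (notation introduced here only for illustration):

```latex
\begin{align*}
S_1 &= 7d_1 + 6d_2 + 5d_3 + 4d_4 + 3d_5 + 2d_6, \\
S_2 &= 3d_1 + 4d_2 + 5d_3 + 6d_4 + 7d_5 + 8d_6, \\
S_1 + S_2 &= 10\,(d_1 + d_2 + d_3 + d_4 + d_5 + d_6) \equiv 0 \pmod{10}, \\
\text{hence}\quad S_1 &\equiv -S_2 \pmod{10}.
\end{align*}
```

In the example, S_1 = 141 and S_2 = 149, and 141 + 149 = 290 = 10 × 29, where 29 is the sum of the six digits.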
This check digit algorithm is not very good: it can fail to reveal a
change in a single digit, which is the most common keying error. For example,
7605811 is a valid IMO number (of Antonio Gramši), but it differs from that of
Kristina Katarina in a single digit position only (the third digit, 0 instead
of 2). This would not happen in any decent check digit system.
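To see how often this happens for one number, the following sketch tries every single-digit change and keeps those that still pass the check. It is an illustration only; the function names are hypothetical, and the check is the one described above (weights 7..2 from left to right, sum modulo 10).

```python
def is_valid_imo(number: str) -> bool:
    """Seven-digit IMO number check: weights 7..2, sum modulo 10."""
    weights = (7, 6, 5, 4, 3, 2)
    total = sum(int(d) * w for d, w in zip(number[:6], weights))
    return total % 10 == int(number[6])


def undetected_single_digit_changes(number: str) -> list[str]:
    """List valid numbers that differ from `number` in exactly one digit."""
    found = []
    for pos in range(len(number)):
        for digit in "0123456789":
            if digit == number[pos]:
                continue  # not a change
            candidate = number[:pos] + digit + number[pos + 1:]
            if is_valid_imo(candidate):
                found.append(candidate)
    return found


variants = undetected_single_digit_changes("7625811")
print(len(variants), "of", 7 * 9, "single-digit changes go undetected")
print(variants)
```

For 7625811 this finds seven such numbers, 7605811 among them. The weak positions are those whose weight shares a factor with 10 (weights 2, 4, 5 and 6), since a suitable change there alters the sum by a multiple of 10.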
Note: This description of the check digit algorithm is not based on any
published document. Instead, I have analysed more than 50 genuine IMO numbers
and found no exceptions.
Coden
A Coden is a six-character identifier assigned by Chemical Abstracts Service (CAS) to
publications. There are two main types of identifiers:
- Serial publications: AAAADC, where AAAA is a mnemonic code derived from the
publication's name, D distinguishes publications which otherwise would have
identical codes and C is the check character.
- Nonserial publications: NNAAHC, where NNAA is the publication's identifier
and the last two characters are as above.
N represents a decimal digit 0..9, A and H represent a capital letter A..Z,
D represents a capital letter usually from the beginning of the alphabet (A..G),
C represents a decimal digit 2..9 or a capital letter A..Z.
Examples: JPERFA, 53AKAE
In some cases only the first five characters of Coden are given. Sometimes, in
place of Coden, another identifier beginning with 00 and without a check
character is used.
How to calculate the check character:
- Each character to be checked is replaced by a numeric value according to these tables:
Character   A  B  C  D  E  F  G  H  I  J  K  L  M
Value       1  2  3  4  5  6  7  8  9 10 11 12 13
Character   N  O  P  Q  R  S  T  U  V  W  X  Y  Z
Value      14 15 16 17 18 19 20 21 22 23 24 25 26
Character   1  2  3  4  5  6  7  8  9  0
Value      27 28 29 30 31 32 33 34 35 36
Unlike in most check digit systems, even decimal digits are replaced by other values.
- These new values are multiplied by weights, which from left to right are: 11, 7, 5, 3,
and 1.
- The products are added together.
- The sum is divided by 34.
- If the remainder is 1..26, the check character is a letter which is looked up from the
previous table. Otherwise the check character is a decimal digit according to this table:
Remainder        27 28 29 30 31 32 33  0
Check character   2  3  4  5  6  7  8  9
Example 1: CYSTE3
Character    C    Y    S    T    E    3
Value        3   25   19   20    5
Weight      11    7    5    3    1
Product     33  175   95   60    5   = 368 ≡ 28 (mod 34) → 3
Example 2: 48THAM
Character    4    8    T    H    A    M
Value       30   34   20    8    1
Weight      11    7    5    3    1
Product    330  238  100   24    1   = 693 ≡ 13 (mod 34) → M
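The calculation above can be sketched in Python as follows. This is a minimal illustration of the method as described here, not code from CAS; the names CHAR_VALUES, CHECK_CHARS, coden_check_char and is_valid_coden are hypothetical, and input is assumed to be in capital letters.

```python
import string

# Character values: A..Z -> 1..26, then 1..9 -> 27..35 and 0 -> 36.
CHAR_VALUES = {c: i for i, c in enumerate(string.ascii_uppercase, start=1)}
CHAR_VALUES.update({c: i for i, c in enumerate("1234567890", start=27)})

# Weights for the five checked characters, from left to right.
WEIGHTS = (11, 7, 5, 3, 1)

# Check characters indexed by remainder: 0 -> 9, 1..26 -> A..Z, 27..33 -> 2..8.
CHECK_CHARS = "9" + string.ascii_uppercase + "2345678"


def coden_check_char(first_five: str) -> str:
    """Return the check character for the first five characters of a Coden."""
    if len(first_five) != 5 or any(c not in CHAR_VALUES for c in first_five):
        raise ValueError("expected five characters from A..Z and 0..9")
    total = sum(CHAR_VALUES[c] * w for c, w in zip(first_five, WEIGHTS))
    return CHECK_CHARS[total % 34]


def is_valid_coden(coden: str) -> bool:
    """Check a complete six-character Coden."""
    try:
        return len(coden) == 6 and coden_check_char(coden[:5]) == coden[5]
    except ValueError:
        return False


print(coden_check_char("CYSTE"))  # Example 1: remainder 28
print(coden_check_char("48THA"))  # Example 2: remainder 13
print(is_valid_coden("JPERFA"), is_valid_coden("53AKAE"))
```

Storing the 34 possible check characters in one string indexed by the remainder avoids a separate branch for letters and digits.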
To discover this method of calculating the check character, more than 2200 Codens
were analysed. Of these, 20 (less than 1 %) do not match this description, but 19 of those
are obvious read errors: someone has misread a character as something that looks or
sounds somewhat the same, and the incorrect character has then been typed. In 16 cases the
check character was incorrect, and in only 3 cases the code itself was in error, so I am
tempted to claim that this method itself is the cause of most of the errors it detects. The
reason for the remaining error is not known; the code is quite different from the one in the
actual publication.
The same erroneous Codens appear on many web pages, which suggests that they are copied
without checking. The correct versions can also be found. A list of the detected
incorrect Codens and their correct equivalents follows (rows with an error elsewhere than
in the check character are marked with *):
Incorrect   Correct
6OWQAW      60WQAW   *
ACMCEL      ACMCEI
BIEDDK      BIEDDX
BJPCBH      BJPCBM
CHEIDL      CHEIDI
EJSCES      DJSCES   *
GCKGEL      GCKGEI
JCPMAE      JPPCEJ   *
JNSLD5      JMSLD5   *
LANGDS      LANGD5
MBADEL      MBADEI
MSICFS      MSICF5
PLSCEH      PLSCE4
SEKEAL      SEKEAI
SENGAS      SENGA5
TCYKES      TCYKE5
ULTRDG      ULTRD6
WTHPDL      WTHPDI
YKXUDH      YKXUD4
ZXXUES      ZXXUE5
At CAS Source Index (CASSI) I have verified that all the Codens marked as correct
in the above table are actually in use.
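The split between check character errors and errors in the code itself can be recounted from the table with a short sketch like the one below. It is an illustration only: the names PAIRS and differing_positions are hypothetical, and the two rows that are broken across lines in the table are assumed to read 6OWQAW and JNSLD5.

```python
# (incorrect, correct) pairs copied from the table above.
PAIRS = [
    ("6OWQAW", "60WQAW"), ("ACMCEL", "ACMCEI"), ("BIEDDK", "BIEDDX"),
    ("BJPCBH", "BJPCBM"), ("CHEIDL", "CHEIDI"), ("EJSCES", "DJSCES"),
    ("GCKGEL", "GCKGEI"), ("JCPMAE", "JPPCEJ"), ("JNSLD5", "JMSLD5"),
    ("LANGDS", "LANGD5"), ("MBADEL", "MBADEI"), ("MSICFS", "MSICF5"),
    ("PLSCEH", "PLSCE4"), ("SEKEAL", "SEKEAI"), ("SENGAS", "SENGA5"),
    ("TCYKES", "TCYKE5"), ("ULTRDG", "ULTRD6"), ("WTHPDL", "WTHPDI"),
    ("YKXUDH", "YKXUD4"), ("ZXXUES", "ZXXUE5"),
]


def differing_positions(incorrect: str, correct: str) -> list[int]:
    """Return the 1-based positions at which the two Codens differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(incorrect, correct)) if a != b]


# A pair is a pure check character error if only position 6 differs.
check_only = [p for p in PAIRS if differing_positions(*p) == [6]]
others = [p for p in PAIRS if p not in check_only]

print(len(check_only), "check character errors")
for incorrect, correct in others:
    print(incorrect, "->", correct, "differs at", differing_positions(incorrect, correct))
```

This reproduces the 16 check character errors; the remaining four rows are the three errors in the code itself and the one unexplained case (JCPMAE).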
This document is an extract from a considerably longer one (in Finnish), in which
I describe more than twenty check character calculation methods and more than
fifty applications.