By Roy Amodeo, Senior Software Architect
OmniMark has undergone a tremendous evolution from its earliest days as a very simple rule-based SGML scripting language, to its current state as a general-purpose programming language with modern software-engineering features.
During the course of this evolution, great pains have been taken to maintain backwards compatibility. Still, some changes to the core language have been necessary. The requirements of a general-purpose programming language, suitable for engineering complex high-performance systems, are very different from those of a simple narrowly-targeted scripting language.
Even so, many older programs do not need to be modified to work with current versions of OmniMark. The only programs that do need modification are those that are written with a version 2 (v2) coding style.
Most programs that do require modification can be updated automatically in only a few seconds, using the migration script provided with this article. A few programs will require additional hand-editing, generally taking just a few minutes more.
This article will walk you through the migration process, and help you troubleshoot those few programs that require further modification.
This process can be run with any version of OmniMark from version 6 to the current version, and the results will be compatible with all of these releases.
You can download the conversion scripts mentioned here in zip format for Windows. These scripts are provided in source code form to make it easier for you to customize them for your particular code base.
This article assumes the use of OmniMark Studio. In versions 8 and above, this will be the OmniMark Studio for Eclipse. In version 7, you might have either the Studio for Eclipse or the standalone Studio. In version 6, you will be using the standalone Studio. The program provided will work in all of these versions, although the procedures for running it vary slightly. The steps for using Studio for Eclipse are listed first, and the steps for standalone Studio follow.
If you have a lot of files to convert, you may wish to compile the migration programs so that they can be run from a batch script. See the OmniMark Studio documentation for more information on compiling programs, and the OmniMark Engine documentation for information about running compiled programs.
If you are working with OmniMark version 6 in a Unix environment, you will have to compile the migration programs and use an OmniMark Engine to execute them.
This step will take a few seconds for each file you need to migrate.
Procedure for OmniMark Studio for Eclipse
Run the to-six.xom program to upgrade the syntax.
include=C:\MyPrograms\OmniMark\xin
-of newfile.xom
-log logfile.xom
-warning-ignore deprecated here, to avoid seeing numerous warnings about obsolete syntax. The obsolete syntax has been retained in this program so that it continues to work in versions 6 and 7.
Procedure for standalone Studio (OmniMark 6)
to-six.xop project to upgrade the syntax.
to-six.xop.
include=C:\MyPrograms\OmniMark\xin
;pre-6". Spaces are added to the front of the rest of the lines to keep everything lined up. (These extra spaces and comment lines will be removed in a later step.)
;pre-6".
If you see any warnings, you should examine the referenced lines in the output file and make sure that the code is correct. If not, you should correct the output file by hand.
Once you have finished converting your program files and your include files, you should try running the programs with the newer version of OmniMark.
Procedure for OmniMark Studio for Eclipse
Procedure for standalone Studio (OmniMark 6)
Create a project file for each command-line:
At this point you should have very few syntax errors, if any. Correct any remaining errors by hand, and then run your programs again. Make sure they produce the same results as your old programs did under your old version of OmniMark.
If you have any problems, or just want to understand this process better, see Appendix C: What the Migration Process Does.
At this point, you have successfully upgraded your programs to work with the newest versions of OmniMark. Now you probably want to get rid of the comment lines added during this process.
This step will take a few seconds for each file you are upgrading.
Procedure for OmniMark Studio for Eclipse
Procedure for standalone Studio (OmniMark 6)
clean-six.xop.
to-six.xop.
You can compare this output file to your original (pre-OmniMark 6) file with any line-by-line comparison utility. You will see that the only changes are the ones necessary to upgrade the syntax.
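If no comparison utility is at hand, the same check can be sketched in a few lines of Python using the standard `difflib` module. This is only an illustration: the function name `changed_lines` and the before/after fragments are hypothetical, not output from the migration scripts.

```python
import difflib


def changed_lines(original_text, migrated_text):
    """Return unified-diff lines showing how migrated_text differs from original_text."""
    diff = difflib.unified_diff(
        original_text.splitlines(),
        migrated_text.splitlines(),
        fromfile="original",
        tofile="migrated",
        lineterm="",
    )
    return list(diff)


if __name__ == "__main__":
    # Hypothetical fragments: the only difference is the dropped "counter" herald.
    before = "; tally the lines\nset counter total to 0\n"
    after = "; tally the lines\nset total to 0\n"
    for line in changed_lines(before, after):
        print(line)
```

Lines prefixed with `-` come from the original file and lines prefixed with `+` from the migrated file; an empty result means the two texts are identical.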
Prior to 5.3, OmniMark allowed you to quote your variable names if you preceded them with a herald (or a type-specific keyword like active or increment).
Without heralding, quoted variable names are indistinguishable from quoted text strings. For this reason, this feature was dropped in version 5.3.
OmniMark 7 re-introduces quoted names to support prefixing of symbolic operators. OmniMark from version 7 on uses a different syntax for quoted names so that they cannot be confused with text strings. In these versions, a quoted name must be wrapped in either #"..." or #'...'.
Programs which use quoted variable names will be automatically migrated to the OmniMark 7 syntax. These programs will be compatible with all newer releases, but they will not be compatible with OmniMark 6 without hand modification.
Another potential problem is that a global variable declaration can "hide" a keyword. If your program did not have any variable declarations, then a variable reference always had to be preceded by a herald or a keyword that acted like a herald. So OmniMark was always able to tell the difference between your variables and language keywords.
From OmniMark V3 to OmniMark 5.2, you could use or omit the herald, as you wished, as long as you declared all of your variables. From OmniMark 5.3 on, variables were always referenced without a type herald.
When the variable name is the same as a keyword, OmniMark sometimes can't tell which you mean. If this occurs, you may get an error message like:
omnimark -- OmniMark Error xxxx on line 1179 in file my-prog-1.xom: Syntax Error. ... The keyword 'SDATA' wasn't recognized because there is a variable, function, or opaque type with the same name.
You can fix these types of errors quickly by doing a search and replace. Make sure you change only the variable references, though, and not the keywords.
This section briefly describes some of the transformations that the upgrade program (to-six) performs.
The first step of the migration process determines whether global variables must be declared, and if so, generates the declarations.
It does this by reading the program file, and all of the files that it includes. If there are no global variable declarations already, but variable references are detected, then a list of global variables will be generated and inserted at the beginning of the program file.
Global variable declarations are not generated for include files, because they would duplicate the ones generated for the main programs.
At this time, the keyword "down-translate" will also be placed at the top of the program if it is needed.
One thing the program does is correct the use of the equals symbol (=).
Before OmniMark V3, the "=" symbol was only used for pattern assignment. V3 introduced a new symbol for pattern assignment (=>) and used "=" for comparisons. However, the use of "=" for pattern assignment was still supported for backwards compatibility.
Needless to say, you shouldn't use the same symbol to mean different things. Since version 5.3, OmniMark issues warning messages wherever the "=" symbol is used for pattern assignment, with a view towards eventually removing this use from the language.
equalize.xin contains a function that looks for solitary "=" symbols in your program and converts them either to "is equal" (the old form of the equality comparison) or to "=>" (the new form of the pattern assignment operator). Either way, ambiguity is eliminated at this stage. The "is equal" construct will be changed back to "=" in a later phase.
Removing type heralds is the final and most extensive part of the process. This is done by a function in deherald.xin.
In addition to removing heralds, this step also replaces some deprecated constructs with their modern equivalents. This includes:
"set counter" and "reset" to "set"
"set buffer" and "set stream" to "set"
"counter", "stream", and "switch" everywhere except in variable declarations
"and" form of variable declarations to a sequence of declarations. OmniMark allows syntax like:
   local switch x and counter y
This is converted to:
   local switch x
   local counter y
"is/isnt equal", "is/isnt greater-than", and "is/isnt less-than" to the symbolic forms
"sgml" and "output" to "#markup-parser" and "#main-output"
"pattern" and "another"
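To make the "and" form conversion listed above concrete, here is a minimal Python sketch of the idea. It is not the code in deherald.xin; the function name `split_and_declaration` is hypothetical, and the sketch assumes that only the three type keywords named in this article appear and that the declaration carries no initial values or quoted text containing " and ".

```python
import re

# Type keywords taken from the article's example; not a complete list.
TYPES = ("switch", "counter", "stream")
SCOPES = ("local", "global")

# Split on "and" only when it is followed by a type keyword.
_AND_BEFORE_TYPE = re.compile(r"\s+and\s+(?=(?:%s)\b)" % "|".join(TYPES))


def split_and_declaration(line):
    """Split 'local switch x and counter y' into one declaration per variable."""
    words = line.split()
    if len(words) < 3 or words[0] not in SCOPES or words[1] not in TYPES:
        return [line]  # not a declaration this sketch understands
    scope = words[0]
    indent = line[: len(line) - len(line.lstrip())]
    rest = line.strip()[len(scope):].strip()
    return [f"{indent}{scope} {part}" for part in _AND_BEFORE_TYPE.split(rest)]


if __name__ == "__main__":
    for declaration in split_and_declaration("local switch x and counter y"):
        print(declaration)
```

Each resulting declaration repeats the scope keyword and the original indentation, which matches the converted form shown in the list.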
You may find that this step results in messages like:
WARNING: Quoted variable name (stream "my-var") - replacing with v7 syntax.
That means that "my-var" may be a quoted variable name here. The variable will be changed to use the OmniMark 7 syntax for quoting names (#"my-var").
You may wish to examine the modified lines of code, and make sure that each one really is a variable reference. If you are migrating to OmniMark 6, you will have to remove the "#" character and the quotes.
When you are migrating to OmniMark 6, make sure that the unquoted name is legal. It must begin with a letter or a character whose numeric value is between 128 and 255, and the subsequent characters must be either one of those, a digit, or a period (.), hyphen (-), or underscore (_). Any other characters must either be replaced or removed.
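The naming rule above can be checked mechanically before you remove the quotes by hand. The following Python sketch is an illustration only: the function names are hypothetical, and it assumes that "letter" means an ASCII letter (A-Z, a-z), which the article does not state explicitly.

```python
import re

# First character: an ASCII letter (assumption) or a character with code 128-255.
# Later characters: the same, or a digit, period, hyphen, or underscore.
_LEGAL_NAME = re.compile(r"[A-Za-z\x80-\xff][A-Za-z0-9\x80-\xff.\-_]*")

# A version 7 quoted name: #"..." or #'...'
_QUOTED_NAME = re.compile(r"#(['\"])(.*)\1")


def is_legal_v6_name(name):
    """Return True if name satisfies the unquoted-name rule described above."""
    return _LEGAL_NAME.fullmatch(name) is not None


def unquote_v7_name(token):
    """Return the bare name from a quoted name, or None if it needs hand-editing."""
    match = _QUOTED_NAME.fullmatch(token)
    if match is None:
        return None  # not a quoted name at all
    name = match.group(2)
    return name if is_legal_v6_name(name) else None


if __name__ == "__main__":
    for token in ('#"my-var"', '#"my var"', "#'total.count'"):
        print(token, "->", unquote_v7_name(token))
```

A result of `None` flags a name that must be replaced or edited by hand, for example one containing a space.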
You will have to be careful with quoted variable names inside of macros. The sequence "%@(...)" in a quoted variable name means that a macro argument is being spliced into the name at that point.
Using macro arguments to build variable names was one way of simulating structures in early OmniMark programs. Now, a better way is simply to use keys to simulate field referencing.
In any event, you cannot use macros this way in OmniMark 6. The best way to correct this is to pass in the complete list of variable names that the macro operates on, instead of just passing in a piece of a variable name.
Finally, with the removal of heralds, there is one other problem area that needs to be dealt with.
In most languages, when you define a variable in a local scope with the same name as a variable in the outer scope, the inner variable hides the outer one. In OmniMark, prior to 5.3, you could still reference the outer variable by heralding it, provided it had a different type than the inner one.
Usually, the only time a name is reused in a program is when one of the variables has a very short lifespan, only being used to capture a value and transfer it to the final destination variable, and the programmer uses the same name because it's easy:
find digit+ => id ":" any-text+ => value "%n"
   local counter id
   set id to pattern id
   ...
This can be easily corrected by changing the name of the pattern variable.
The file finddup.xin contains a function that can detect some of these variable name reuses. It also attempts to warn about variables declared with the same name as another variable visible in the same scope.
These checks are heuristic, and can be fooled by macros, or by declaring the variables in one file, and using them in another. However, these checks should find many of the common cases.