
Bytecode

From Wikipedia, the free encyclopedia
Form of instruction set designed to be run by a software interpreter
"Portable code" and "P-code" redirect here. For other uses, see software portability and P-Code (disambiguation).

Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable[1] source code, bytecode consists of compact numeric codes, constants, and references (normally numeric addresses) that encode the result of the compiler's parsing and semantic analysis of properties such as the type, scope, and nesting depth of program objects.

The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters. Programming language implementations may emit an intermediate representation such as bytecode to ease interpretation, or to reduce hardware and operating system dependence by allowing the same code to run cross-platform, on different devices. Bytecode is often either executed directly on a virtual machine (a p-code machine, i.e., an interpreter) or further compiled into machine code for better performance.
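To make the "one-byte opcode followed by optional parameters" layout concrete, the following is a minimal sketch of a decoder for a made-up instruction set. The opcode numbers, mnemonics, and operand widths are assumptions chosen for illustration and do not correspond to any real virtual machine.

```python
# Hypothetical instruction set: opcode byte -> (mnemonic, number of operand bytes).
# These values are invented for illustration only.
OPCODES = {
    0x01: ("PUSH", 1),   # followed by a one-byte constant
    0x02: ("ADD", 0),    # no operand
    0x03: ("MUL", 0),    # no operand
    0x04: ("JUMP", 2),   # followed by a two-byte big-endian target offset
    0xFF: ("HALT", 0),   # no operand
}


def decode(code: bytes):
    """Yield (offset, mnemonic, operand) for each instruction in `code`."""
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        if opcode not in OPCODES:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {pc}")
        mnemonic, width = OPCODES[opcode]
        raw = code[pc + 1 : pc + 1 + width]
        if len(raw) != width:
            raise ValueError(f"truncated operand at offset {pc}")
        operand = int.from_bytes(raw, "big") if width else None
        yield pc, mnemonic, operand
        pc += 1 + width  # skip the opcode byte plus its operand bytes


program = bytes([0x01, 2, 0x01, 3, 0x02, 0x04, 0x00, 0x08, 0xFF])
listing = list(decode(program))
for offset, mnemonic, operand in listing:
    print(offset, mnemonic, "" if operand is None else operand)
```

Because each opcode determines how many operand bytes follow it, the decoder can walk the byte stream without any separators between instructions.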

Since bytecode instructions are processed by software, they may be arbitrarily complex, but are nonetheless often akin to traditional hardware instructions: virtual stack machines are the most common, but virtual register machines have also been built.[2][3] Different parts of a program may be stored in separate files, similar to object modules, and loaded dynamically during execution.
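The difference between the two machine styles can be sketched with a pair of toy interpreters that both evaluate (2 + 3) * 4. The instruction names and the tuple encoding are hypothetical and far simpler than any real virtual machine.

```python
def run_stack(program):
    """Toy stack machine: operands are implicit (the top of the stack)."""
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return stack.pop()


def run_register(program, num_registers=4):
    """Toy register machine: each instruction names its operands explicitly."""
    regs = [0] * num_registers
    for op, *args in program:
        if op == "LOADI":            # LOADI dst, constant
            dst, value = args
            regs[dst] = value
        elif op == "ADD":            # ADD dst, src1, src2
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "MUL":            # MUL dst, src1, src2
            dst, a, b = args
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return regs[0]                   # by convention here, the result is in r0


# (2 + 3) * 4 expressed for each machine
stack_program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
register_program = [
    ("LOADI", 0, 2), ("LOADI", 1, 3), ("ADD", 0, 0, 1),
    ("LOADI", 1, 4), ("MUL", 0, 0, 1),
]

stack_result = run_stack(stack_program)
register_result = run_register(register_program)
print(stack_result, register_result)
```

In the stack version the arithmetic instructions carry no operands at all, whereas every register instruction spells out its destination and sources.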

Execution


A bytecode program may be executed by parsing and directly executing the instructions, one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators or just-in-time (JIT) compilers, translate bytecode into machine code as necessary at runtime. This makes the virtual machine hardware-specific but does not lose the portability of the bytecode. For example, Java and Smalltalk code is typically stored in bytecode format and then JIT-compiled to machine code before execution. Compiling the bytecode to native machine code introduces a delay before the program runs, but it improves execution speed considerably compared to interpreting source code directly, normally by around an order of magnitude (10x).[4]
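The trade-off described above, paying a translation cost once so that no decoding remains at run time, can be sketched without generating real machine code. The example below contrasts a plain interpreter loop with a translator that turns each instruction into a Python closure ahead of execution. This is only an analogy for dynamic translation: a real JIT compiler emits native machine code, the tuple-based instruction format is hypothetical, and no speed claim is made for this toy version.

```python
def interpret(program, x):
    """Decode and execute every instruction each time the program runs."""
    stack = [x]
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return stack.pop()


def translate(program):
    """Decode once, up front, into a list of ready-to-call closures."""
    def make_push(value):
        return lambda stack: stack.append(value)

    def make_binary(fn):
        def step(stack):
            b, a = stack.pop(), stack.pop()
            stack.append(fn(a, b))
        return step

    steps = []
    for op, arg in program:
        if op == "PUSH":
            steps.append(make_push(arg))
        elif op == "ADD":
            steps.append(make_binary(lambda a, b: a + b))
        elif op == "MUL":
            steps.append(make_binary(lambda a, b: a * b))
        else:
            raise ValueError(f"unknown instruction {op!r}")

    def translated(x):
        stack = [x]
        for step in steps:   # no opcode decoding is left at run time
            step(stack)
        return stack.pop()

    return translated


# Computes (x + 1) * 2 for an input x that starts on the stack.
program = [("PUSH", 1), ("ADD", None), ("PUSH", 2), ("MUL", None)]
translated = translate(program)   # one-time translation cost
print(interpret(program, 5), translated(5))
```

The bytecode itself stays portable in both cases; only the translator's output is tied to the host it runs on.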

Because of this performance advantage, many language implementations today execute a program in two phases: first compiling the source code into bytecode, and then passing the bytecode to the virtual machine. There are bytecode-based virtual machines of this sort for Java, Raku, Python, PHP,[a] Tcl, mawk and Forth (however, Forth is seldom compiled via bytecodes in this way, and its virtual machine is more generic instead). The implementations of Perl and Ruby 1.8 instead work by walking an abstract syntax tree representation derived from the source code.
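In CPython, one of the implementations that follows this two-phase model, the phases can be observed separately through the built-in compile() and exec() functions. The snippet below is a small illustration; the variable names and the "<example>" file name are arbitrary.

```python
source = "result = (2 + 3) * x"

# Phase 1: compile the source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")
print(type(code_obj).__name__)     # the compiled unit is a "code" object
print(len(code_obj.co_code))       # size of the raw bytecode, version-dependent

# Phase 2: hand the compiled code to the virtual machine for execution.
namespace = {"x": 4}
exec(code_obj, namespace)
print(namespace["result"])
```

The same code object can be executed repeatedly with different namespaces without recompiling the source.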

More recently, the authors of V8[1] and Dart[7] have challenged the notion that intermediate bytecode is needed for a fast and efficient VM implementation. At the time of those writings, both language implementations performed JIT compilation directly from source code to machine code, with no bytecode intermediary;[8] V8 has since added a bytecode interpreter (Ignition).

Examples

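In Common Lisp, the standard DISASSEMBLE function prints the compiled code of a function. The output is implementation-dependent and is not necessarily bytecode; the listing below is x86 machine code:
(disassemble '(lambda (x) (print x)))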
; disassembly for (LAMBDA (X))
; 2436F6DF: 850500000F22 TEST EAX, [#x220F0000]  ; no-arg-parsing entry point
; E5: 8BD6 MOV EDX, ESI
; E7: 8B05A8F63624 MOV EAX, [#x2436F6A8]  ; #<FDEFINITION object for PRINT>
; ED: B904000000 MOV ECX, 4
; F2: FF7504 PUSH DWORD PTR [EBP+4]
; F5: FF6005 JMP DWORD PTR [EAX+5]
; F8: CC0A BREAK 10  ; error trap
; FA: 02 BYTE #X02
; FB: 18 BYTE #X18  ; INVALID-ARG-COUNT-ERROR
; FC: 4F BYTE #X4F  ; ECX
In Python, compiled code can be analysed and investigated using the built-in dis module, a tool for inspecting the low-level bytecode. It can be invoked from the interactive shell, for example:
>>> import dis  # "dis" - Disassembler of Python byte code into mnemonics.
>>> dis.dis('print("Hello, World!")')
  1           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 ('Hello, World!')
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
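The exact instructions in such a listing depend on the interpreter version; for instance, the CALL_FUNCTION opcode shown above was replaced by other call instructions in CPython 3.11. A hedged way to inspect the bytecode programmatically, rather than relying on a fixed listing, is dis.get_instructions(), which accepts the same source string:

```python
import dis

# Collect the opcode names for the same statement. The exact sequence
# depends on the Python version that runs this example.
opnames = [ins.opname for ins in dis.get_instructions('print("Hello, World!")')]
print(opnames)
```

Comparing the printed list across interpreter versions shows how the bytecode format changes between releases while the source code stays the same.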


Notes

  a. ^ PHP has had just-in-time compilation since PHP 8;[5][6] before that, JIT was not part of the default implementation, although alternatives such as HHVM offered it. In older versions of PHP, opcodes are generated each time the program is launched and are always interpreted, not just-in-time compiled.

References

  1. ^ a b "Dynamic Machine Code Generation". Google Inc. Archived from the original on 2017-03-05. Retrieved 2024-12-01.
  2. ^ "The Implementation of Lua 5.0". (NB. This involves a register-based virtual machine.)
  3. ^ "Dalvik VM". Archived from the original on 2013-05-18. Retrieved 2012-10-29. (NB. This VM is register based.)
  4. ^ "Byte Code Vs Machine Code". www.allaboutcomputing.net. Retrieved 2017-10-23.
  5. ^ O’Phinney, Matthew Weier. "Exploring the New PHP JIT Compiler". Zend by Perforce. Retrieved 2021-02-19.
  6. ^ "PHP 8: The JIT - stitcher.io". stitcher.io. Retrieved 2021-02-19.
  7. ^ Loitsch, Florian. "Why Not a Bytecode VM?". Google. Archived from the original on 2013-05-12.
  8. ^ "JavaScript myth: JavaScript needs a standard bytecode". 2ality.com.
  9. ^ G., Adam Y. (2022-07-11). "Berkeley Pascal". GitHub. Retrieved 2022-01-08.
  10. ^ "CLHS: Function DISASSEMBLE". www.lispworks.com.
  11. ^ Collective (2023-12-13). "The Common Lisp Cookbook – Performance Tuning and Tips". lispcookbook.github.io.
  12. ^ "The Implementation of the Icon Programming Language" (PDF). Archived from the original (PDF) on 2016-03-05. Retrieved 2011-09-09.
  13. ^ "The Implementation of Icon and Unicon a Compendium" (PDF). Archived (PDF) from the original on 2022-10-09.
  14. ^ Paul, Matthias R. (2001-12-30). "KEYBOARD.SYS internal structure". Newsgroup: comp.os.msdos.programmer. Archived from the original on 2017-09-09. Retrieved 2016-09-17. [...] In fact, the format is basically the same in MS-DOS 3.3 - 8.0, PC DOS 3.3 - 2000, including Russian, Lithuanian, Chinese and Japanese issues, as well as in Windows NT, 2000, and XP [...]. There are minor differences and incompatibilities, but the general format has not changed over the years. [...] Some of the data entries contain normal tables [...] However, most entries contain executable code interpreted by some kind of p-code interpreter at *runtime*, including conditional branches and the like. This is why the KEYB driver has such a huge memory footprint compared to table-driven keyboard drivers which can be done in 3 - 4 Kb getting the same level of function except for the interpreter. [...]
  15. ^ Mendelson, Edward (2001-07-20). "How to Display the Euro in MS-DOS and Windows DOS". Display the euro symbol in full-screen MS-DOS (including Windows 95 or Windows 98 full-screen DOS). Archived from the original on 2016-09-17. Retrieved 2016-09-17. [...] Matthias [R.] Paul [...] warns that the IBM PC DOS version of the keyboard driver uses some internal procedures that are not recognized by the Microsoft driver, so, if possible, you should use the IBM versions of both KEYB.COM and KEYBOARD.SYS instead of mixing Microsoft and IBM versions [...] (NB. What is meant by "procedures" here are some additional bytecodes in the IBM KEYBOARD.SYS file not supported by the Microsoft version of the KEYB driver.)
  16. ^ "United States Patent 6,973,644". Archived from the original on 2017-03-05. Retrieved 2009-05-21.
  17. ^ Microsoft C Pcode Specifications. p. 13. Multiplan wasn't compiled to machine code, but to a kind of byte-code which was run by an interpreter, in order to make Multiplan portable across the widely varying hardware of the time. This byte-code distinguished between the machine-specific floating point format to calculate on, and an external (standard) format, which was binary coded decimal (BCD). The PACK and UNPACK instructions converted between the two.
  18. ^ "R Installation and Administration". cran.r-project.org.
  19. ^ "The SQLite Bytecode Engine". Archived from the original on 2017-04-14. Retrieved 2016-08-29.
