Bytecode

Bytecode (also called portable code or p-code[citation needed]) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable[1] source code, bytecode consists of compact numeric codes, constants, and references (normally numeric addresses) that encode the results of the compiler's parsing and semantic analysis, such as the type, scope, and nesting depth of program objects.

The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters. Programming language implementations may emit an intermediate representation such as bytecode to ease interpretation, or to reduce hardware and operating system dependence by allowing the same code to run cross-platform, on different devices. Bytecode is often either executed directly on a virtual machine (a p-code machine, i.e., an interpreter) or further compiled into machine code for better performance.
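The "one-byte opcode followed by optional parameters" layout can be illustrated with a minimal sketch. The opcode values, the 2-byte little-endian operand, and the names assemble and disassemble are all assumptions made up for this example; they do not correspond to any real virtual machine.

```python
import struct

# Hypothetical opcode table, for illustration only (not a real VM's encoding).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF
OPERAND_SIZE = {PUSH: 2}  # assumption: only PUSH carries a 2-byte operand


def assemble(program):
    """Encode (opcode, *operands) tuples into a flat bytes object."""
    out = bytearray()
    for opcode, *operands in program:
        out.append(opcode)  # the one-byte opcode
        if opcode in OPERAND_SIZE:
            out += struct.pack("<H", operands[0])  # the optional parameter
    return bytes(out)


def disassemble(code):
    """Decode bytes back into (offset, opcode, operand) tuples."""
    pc, result = 0, []
    while pc < len(code):
        opcode = code[pc]
        operand = None
        if opcode in OPERAND_SIZE:
            (operand,) = struct.unpack_from("<H", code, pc + 1)
        result.append((pc, opcode, operand))
        pc += 1 + OPERAND_SIZE.get(opcode, 0)
    return result


code = assemble([(PUSH, 2), (PUSH, 40), (ADD,), (HALT,)])
print(code.hex())
print(disassemble(code))
```

Because each opcode determines how many operand bytes follow it, the decoder can walk the byte string without any separators between instructions.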

Since bytecode instructions are processed by software, they may be arbitrarily complex, but they are nonetheless often akin to traditional hardware instructions: virtual stack machines are the most common, but virtual register machines have also been built.[2][3] Different parts of a program may often be stored in separate files, similar to object modules, but are dynamically loaded during execution.
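The difference between the two machine styles can be sketched by evaluating a * b + c both ways. The instruction names and the tuple-based program format are hypothetical, chosen only to show that a stack machine takes its operands implicitly from the stack while a register machine names them in each instruction.

```python
# Hypothetical instruction sets; tuples stand in for encoded bytecode.

def run_stack(program, env):
    """Stack machine: operands are implicit (popped from the stack)."""
    stack = []
    for op, *args in program:
        if op == "LOAD":
            stack.append(env[args[0]])
        elif op in ("ADD", "MUL"):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown instruction {op}")
    return stack.pop()


def run_register(program, env, nregs=4):
    """Register machine: every instruction names its operands explicitly."""
    regs = [0] * nregs
    for op, *args in program:
        if op == "LOAD":
            dst, name = args
            regs[dst] = env[name]
        elif op in ("ADD", "MUL"):
            dst, x, y = args
            regs[dst] = regs[x] + regs[y] if op == "ADD" else regs[x] * regs[y]
        else:
            raise ValueError(f"unknown instruction {op}")
    return regs[0]  # assumption: the result is left in register 0


STACK_CODE = [("LOAD", "a"), ("LOAD", "b"), ("MUL",), ("LOAD", "c"), ("ADD",)]
REGISTER_CODE = [
    ("LOAD", 0, "a"),
    ("LOAD", 1, "b"),
    ("MUL", 0, 0, 1),
    ("LOAD", 1, "c"),
    ("ADD", 0, 0, 1),
]

env = {"a": 3, "b": 4, "c": 5}
print(run_stack(STACK_CODE, env), run_register(REGISTER_CODE, env))
```

Stack instructions are shorter because they carry no operand fields, whereas register instructions are wider but avoid the push and pop traffic.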

Execution

A bytecode program may be executed by parsing and directly executing the instructions, one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators or just-in-time (JIT) compilers, translate bytecode into machine code as necessary at runtime. This makes the virtual machine hardware-specific but does not lose the portability of the bytecode. For example, Java and Smalltalk code is typically stored in bytecode format, which is then typically JIT-compiled to machine code before execution. Compiling the bytecode to native machine code introduces a delay before the program runs, but it improves execution speed considerably compared to interpreting source code directly, normally by around an order of magnitude (10x).[4]
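Executing instructions "one at a time" usually means a fetch, decode, execute loop driven by a program counter. The following is a minimal sketch of such a loop for a stack machine; the four opcodes and the single-byte operand of OP_PUSH are assumptions invented for this example.

```python
# Hypothetical one-byte opcodes, for illustration only.
OP_PUSH, OP_ADD, OP_MUL, OP_HALT = 0, 1, 2, 3


def interpret(code: bytes) -> int:
    """Run a tiny stack-machine program and return the top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while True:
        opcode = code[pc]  # fetch
        pc += 1
        if opcode == OP_PUSH:  # decode and execute
            stack.append(code[pc])  # operand is the byte after the opcode
            pc += 1
        elif opcode == OP_ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == OP_MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == OP_HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode} at offset {pc - 1}")


# Encodes (2 + 3) * 4.
program = bytes([OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT])
print(interpret(program))
```

The loop itself contains nothing hardware-specific, which is why this style of interpreter ports easily; a JIT compiler would instead translate the same byte sequence into native instructions for the host CPU.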

Because of its performance advantage, today many language implementations execute a program in two phases: first compiling the source code into bytecode, and then passing the bytecode to the virtual machine. There are bytecode-based virtual machines of this sort for Java, Raku, Python, PHP,[a] Tcl, mawk and Forth (however, Forth is seldom compiled via bytecodes in this way, and its virtual machine is more generic instead). The implementations of Perl and Ruby 1.8 instead work by walking an abstract syntax tree representation derived from the source code.
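Both strategies can be tried from Python itself. The sketch below uses the built-in compile() and eval() to make the two phases explicit, then evaluates the same expression by walking its syntax tree with the standard ast module. The walk function and its small operator table are illustrative assumptions and handle only a few node types; they are not how Perl or Ruby 1.8 are implemented.

```python
import ast
import operator

source = "(2 + 3) * 4"

# Phase 1: compile the source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "eval")
# Phase 2: hand the compiled bytecode to the virtual machine.
bytecode_result = eval(code_obj)

# Alternative strategy: evaluate by walking the abstract syntax tree.
BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def walk(node):
    """Recursively evaluate a small subset of expression nodes."""
    if isinstance(node, ast.Expression):
        return walk(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
    raise TypeError(f"unsupported node: {type(node).__name__}")


tree_result = walk(ast.parse(source, mode="eval"))
print(bytecode_result, tree_result, len(code_obj.co_code))
```

The code object can be evaluated repeatedly without parsing the source again, which is part of where the two-phase design gains its advantage.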

More recently, the authors of V8[1] and Dart[7] have challenged the notion that intermediate bytecode is needed for a fast and efficient VM implementation. At the time, both implementations JIT-compiled source code directly to machine code with no bytecode intermediary;[8] V8 has since added a bytecode interpreter (Ignition).

Examples

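  - ActionScript executes in the ActionScript Virtual Machine (AVM), which is part of Flash Player and AIR. ActionScript code is typically transformed into bytecode format by a compiler. Examples of compilers include one built into Adobe Flash Professional and one built into Adobe Flash Builder and available in the Adobe Flex SDK.
  - Adobe Flash objects
  - BANCStar, originally bytecode for an interface-building tool but used also as a language
  - Berkeley Packet Filter
  - Berkeley Pascal[9]
  - Byte Code Engineering Library
  - C to Java virtual machine compilers
  - CLISP implementation of Common Lisp used to compile only to bytecode for many years; however, now it also supports compiling to native code with the help of GNU lightning
  - CMUCL and Scieneer Common Lisp implementations of Common Lisp can compile either to native code or to bytecode, which is far more compact
  - Common Intermediate Language executed by Common Language Runtime, used by .NET languages such as C#
  - Dalvik bytecode, designed for the Android platform, is executed by the Dalvik virtual machine
  - Dis bytecode, designed for the Inferno operating system, is executed by the Dis virtual machine
  - EiffelStudio for the Eiffel programming language
  - EM, the Amsterdam Compiler Kit virtual machine, used as an intermediate compiling language and as a modern bytecode language
  - Emacs is a text editor with most of its functions implemented by Emacs Lisp, its built-in dialect of Lisp. These features are compiled into bytecode. This architecture allows users to customize the editor with a high-level language, which after compiling into bytecode yields reasonable performance.
  - Embeddable Common Lisp implementation of Common Lisp can compile to bytecode or C code
  - Common Lisp provides a disassemble function[10] which prints to the standard output the underlying code of a specified function. The result is implementation-dependent and may or may not resolve to bytecode. Its inspection can be utilized for debugging and optimization purposes.[11] Steel Bank Common Lisp, for instance, produces x86 machine code rather than bytecode:

      (disassemble '(lambda (x) (print x)))
      ; disassembly for (LAMBDA (X))
      ; 2436F6DF:       850500000F22     TEST EAX, [#x220F0000]   ; no-arg-parsing entry point
      ;       E5:       8BD6             MOV EDX, ESI
      ;       E7:       8B05A8F63624     MOV EAX, [#x2436F6A8]    ; #<FDEFINITION object for PRINT>
      ;       ED:       B904000000       MOV ECX, 4
      ;       F2:       FF7504           PUSH DWORD PTR [EBP+4]
      ;       F5:       FF6005           JMP DWORD PTR [EAX+5]
      ;       F8:       CC0A             BREAK 10                 ; error trap
      ;       FA:       02               BYTE #X02
      ;       FB:       18               BYTE #X18                ; INVALID-ARG-COUNT-ERROR
      ;       FC:       4F               BYTE #X4F                ; ECX

  - Ericsson implementation of Erlang uses BEAM bytecodes
  - Ethereum's Virtual Machine (EVM) is the runtime environment, using its own bytecode, for transaction execution in Ethereum (smart contracts)
  - Icon[12] and Unicon[13] programming languages
  - Infocom used the Z-machine to make its software applications more portable
  - Java bytecode, which is executed by the Java virtual machine (related tools: ASM, BCEL, Javassist)
  - Keiko bytecode used by the Oberon-2 programming language to make it and the Oberon operating system more portable
  - KEYB, the MS-DOS/PC DOS keyboard driver with its resource file KEYBOARD.SYS containing layout information and short p-code sequences executed by an interpreter inside the resident driver[14][15]
  - LLVM IR
  - LSL, a scripting language used in virtual worlds, compiles into bytecode running on a virtual machine. Second Life has the original Mono version; Inworldz developed the Phlox version.
  - Lua language uses a register-based bytecode virtual machine
  - m-code of the MATLAB language[16]
  - Malbolge is an esoteric machine language for a ternary virtual machine
  - Microsoft P-code used in Visual C++ and Visual Basic
  - Multiplan[17]
  - O-code of the BCPL programming language
  - OCaml language optionally compiles to a compact bytecode form
  - p-code of UCSD Pascal implementation of the Pascal language
  - Parrot virtual machine
  - Pick BASIC, also referred to as Data BASIC or MultiValue BASIC
  - The R environment for statistical computing offers a bytecode compiler through the compiler package, now standard with R version 2.13.0. It is possible to compile this version of R so that the base and recommended packages exploit this.[18]
  - Pyramid 2000 adventure game
  - Python source is compiled on execution to Python's bytecode, and the compiled files (.pyc) for imported modules are cached in a __pycache__ directory beside the source. Compiled code can be analysed and investigated using a built-in tool for debugging the low-level bytecode. The tool can be invoked from the interactive shell, for example (this listing comes from an older CPython release; opcode names and offsets vary between versions):

      >>> import dis  # "dis" - Disassembler of Python byte code into mnemonics.
      >>> dis.dis('print("Hello, World!")')
        1           0 LOAD_NAME                0 (print)
                    2 LOAD_CONST               0 ('Hello, World!')
                    4 CALL_FUNCTION            1
                    6 RETURN_VALUE

  - Scheme 48 implementation of Scheme using bytecode interpreter
  - Bytecodes of many implementations of the Smalltalk language
  - The Spin interpreter built into the Parallax Propeller microcontroller
  - The SQLite database engine translates SQL statements into a bespoke byte-code format[19]
  - Apple SWEET16
  - Tcl
  - TIMI is used by compilers on the IBM i platform
  - Tiny BASIC
  - Visual FoxPro compiles to bytecode
  - WebAssembly
  - YARV and Rubinius for Ruby
  - ZCODE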
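Since the exact listing printed by dis depends on the interpreter version, a script that inspects bytecode is better off reading instruction names programmatically. The sketch below uses dis.get_instructions and importlib.util.cache_from_source from the standard library; the helper name opnames and the file name example.py are placeholders chosen for illustration.

```python
import dis
import importlib.util


def opnames(source: str) -> list[str]:
    """Return the opcode mnemonics CPython generates for a source string."""
    code = compile(source, "<example>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code)]


names = opnames('print("Hello, World!")')
print(names)  # the exact list differs between CPython versions

# Path where CPython would cache the bytecode of an imported module
# named example.py (a placeholder file name; nothing is written here).
cache_path = importlib.util.cache_from_source("example.py")
print(cache_path)
```

On recent CPython releases the call instruction appears under a different name than CALL_FUNCTION, which is why the sketch prints the names instead of hard-coding them.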

See also

  - Intermediate representation
  - Platform (computing)
  - Runtime system

Notes

  a. ^ PHP has had just-in-time compilation since PHP 8;[5][6] before that, JIT was not part of the default implementation, although alternatives such as HHVM offered it. In older versions of PHP, opcodes are generated each time the program is launched and are always interpreted rather than just-in-time compiled.

References

  1. ^ a b "Dynamic Machine Code Generation". Google Inc.
  2. ^ "The Implementation of Lua 5.0". (NB. This involves a register-based virtual machine.)
  3. ^ "Dalvik VM". Archived from the original on 2013-05-18. Retrieved 2012-10-29. (NB. This VM is register based.)
  4. ^ "Byte Code Vs Machine Code". www.allaboutcomputing.net. Retrieved 2017-10-23.
  5. ^ O’Phinney, Matthew Weier. "Exploring the New PHP JIT Compiler". Zend by Perforce. Retrieved 2021-02-19.
  6. ^ "PHP 8: The JIT - stitcher.io". stitcher.io. Retrieved 2021-02-19.
  7. ^ Loitsch, Florian. "Why Not a Bytecode VM?". Google. Archived from the original on 2013-05-12.
  8. ^ "JavaScript myth: JavaScript needs a standard bytecode". 2ality.com.
  9. ^ G., Adam Y. (2022-07-11). "Berkeley Pascal". GitHub. Retrieved 2022-01-08.
  10. ^ "CLHS: Function DISASSEMBLE". www.lispworks.com.
  11. ^ "Performance Tuning and Tips". lispcookbook.github.io.
  12. ^ "The Implementation of the Icon Programming Language" (PDF). Archived from the original (PDF) on 2016-03-05. Retrieved 2011-09-09.
  13. ^ "The Implementation of Icon and Unicon a Compendium" (PDF). Archived (PDF) from the original on 2022-10-09.
  14. ^ Paul, Matthias R. (2001-12-30). "KEYBOARD.SYS internal structure". Newsgroup: comp.os.msdos.programmer. Archived from the original on 2017-09-09. Retrieved 2016-09-17. […] In fact, the format is basically the same in MS-DOS 3.3 - 8.0, PC DOS 3.3 - 2000, including Russian, Lithuanian, Chinese and Japanese issues, as well as in Windows NT, 2000, and XP […]. There are minor differences and incompatibilities, but the general format has not changed over the years. […] Some of the data entries contain normal tables […] However, most entries contain executable code interpreted by some kind of p-code interpreter at *runtime*, including conditional branches and the like. This is why the KEYB driver has such a huge memory footprint compared to table-driven keyboard drivers which can be done in 3 - 4 Kb getting the same level of function except for the interpreter. […]
  15. ^ Mendelson, Edward (2001-07-20). "How to Display the Euro in MS-DOS and Windows DOS". Display the euro symbol in full-screen MS-DOS (including Windows 95 or Windows 98 full-screen DOS). Archived from the original on 2016-09-17. Retrieved 2016-09-17. […] Matthias [R.] Paul […] warns that the IBM PC DOS version of the keyboard driver uses some internal procedures that are not recognized by the Microsoft driver, so, if possible, you should use the IBM versions of both KEYB.COM and KEYBOARD.SYS instead of mixing Microsoft and IBM versions […] (NB. What is meant by "procedures" here are some additional bytecodes in the IBM KEYBOARD.SYS file not supported by the Microsoft version of the KEYB driver.)
  16. ^ "United States Patent 6,973,644".
  17. ^ Microsoft C Pcode Specifications. p. 13. Multiplan wasn't compiled to machine code, but to a kind of byte-code which was run by an interpreter, in order to make Multiplan portable across the widely varying hardware of the time. This byte-code distinguished between the machine-specific floating point format to calculate on, and an external (standard) format, which was binary coded decimal (BCD). The PACK and UNPACK instructions converted between the two.
  18. ^ "R Installation and Administration". cran.r-project.org.
  19. ^ "The SQLite Bytecode Engine". Archived from the original on 2017-04-14. Retrieved 2016-08-29.
