Comparison of programming languages (syntax)

Expressions

Programming language expressions can be broadly classified into four syntax structures:

prefix notation
  • Lisp (* (+ 2 3) (expt 4 5))
infix notation
suffix, postfix, or Reverse Polish notation
math-like notation
  • TUTOR (2 + 3)(4⁵) $$ note implicit multiply operator
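
To make the difference between the notations concrete, the following is a minimal Python sketch that evaluates the same expression written in infix, prefix, and postfix form. The helper names (eval_prefix, eval_postfix, OPS) and the use of ^ for exponentiation in the token lists are assumptions made for illustration; they are not taken from any of the languages listed above.

```python
import operator

# Binary operators shared by both evaluators; "^" stands for exponentiation here.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "^": operator.pow}


def eval_postfix(tokens):
    """Evaluate a postfix (Reverse Polish) token list using a stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()   # the most recent operand is the right-hand one
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()


def eval_prefix(tokens):
    """Evaluate a prefix token list by scanning it from right to left."""
    stack = []
    for tok in reversed(tokens):
        if tok in OPS:
            left = stack.pop()    # scanning backwards, the left operand is on top
            right = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()


infix_value = (2 + 3) * 4 ** 5                          # infix, as Python writes it
prefix_value = eval_prefix("* + 2 3 ^ 4 5".split())     # prefix, like the Lisp example
postfix_value = eval_postfix("2 3 + 4 5 ^ *".split())   # postfix / Reverse Polish
```

All three values are equal; only the position of the operator relative to its operands differs.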

Statements

Programming language statements typically have conventions for:

  • statement separators;
  • statement terminators; and
  • line continuation

A statement separator is used to demarcate boundaries between two separate statements. A statement terminator is used to demarcate the end of an individual statement. Languages that interpret the end of line to be the end of a statement are called "line-oriented" languages.

"Line continuation" is a convention in line-oriented languages where the newline character could potentially be misinterpreted as a statement terminator. In such languages, it allows a single statement to span more than just one line.

Language Statement separator-terminator Secondary separator[1]
ABAP period separated
Ada semicolon terminated
ALGOL semicolon separated
ALGOL 68 semicolon and comma separated[2]
APL newline terminated ⋄ separated
AppleScript newline terminated
AutoHotkey newline terminated
BASIC newline terminated colon separated
Boo newline terminated
C semicolon terminates statements comma separates expressions
C++ semicolon terminates statements comma separates expressions
C# semicolon terminated
COBOL whitespace separated, sometimes period separated, optionally separated with commas and semi-colons.
Cobra newline terminated
CoffeeScript newline terminated
CSS semicolon separated
D semicolon terminated
Eiffel newline terminated semicolon
Erlang comma separated, period terminated
F# newline terminated semicolon
Fortran newline terminated semicolon
Forth semicolons terminate word definitions
GFA BASIC newline terminated
Go semicolon separated (inserted by compiler)
Haskell (in do-notation) newline separated
Haskell (in do-notation, when braces are used) semicolon separated
Java semicolon terminated
JavaScript semicolon separated (but sometimes implicitly inserted on newlines)
Kotlin semicolon separated (but sometimes implicitly inserted on newlines)
Lua whitespace separated (semicolon optional)
Mathematica semicolon separated
MATLAB newline terminated semicolon or comma[3]
Object Pascal (Delphi) semicolon separated
Objective-C semicolon terminated
OCaml semicolon separated
Pascal semicolon separated
Perl semicolon separated
PHP semicolon terminated
Pick Basic newline terminated semicolon separated
PowerShell newline terminated semicolon separated
Prolog comma separated (conjunction), semicolon separated (disjunction), period terminated (clause)
Python newline terminated semicolon
Raku semicolon separated
Red whitespace separated
Ruby newline terminated semicolon
Rust semicolon terminates statements comma separates expressions
Scala newline terminated (semicolon optional) semicolon
Seed7 semicolon separated (semicolon termination is allowed)
Simula semicolon separated
S-Lang semicolon separated
Smalltalk period separated
Standard ML semicolon separated
Swift semicolon separated (inserted by compiler)
Visual Basic newline terminated colon separated
Visual Basic .NET newline terminated colon separated
Wolfram Language semicolon separated
Xojo newline terminated
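
As an illustration of one row of the table, the following short Python sketch shows a newline acting as the statement terminator, a semicolon acting as the secondary separator, and two ways of continuing a statement across lines. The variable names are arbitrary and chosen only for this example.

```python
# Python is line-oriented: a newline normally terminates a statement.
a = 1
b = 2

# A semicolon acts as a secondary separator for several statements on one line.
c = 3; d = 4

# A backslash as the last character of a line continues the statement.
total = a + b + \
    c + d

# Inside brackets, newlines are not treated as statement terminators.
values = [
    a, b,
    c, d,
]
```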

Line continuation

Line continuation is generally done as part of lexical analysis: a newline normally results in a token being added to the token stream, unless line continuation is detected.
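
The idea can be sketched with a toy lexer in Python. This is a simplified illustration under stated assumptions, not how any particular compiler works: the function names logical_lines and toy_tokens are hypothetical, tokens are simply whitespace-separated words, and a trailing backslash is assumed to be the continuation marker.

```python
def logical_lines(source, marker="\\"):
    """Join physical lines that end with the continuation marker."""
    pending = ""
    result = []
    for line in source.split("\n"):
        if line.endswith(marker):
            pending += line[: -len(marker)]   # drop the marker, keep accumulating
        else:
            result.append(pending + line)
            pending = ""
    if pending:
        result.append(pending)                # source ended on a continued line
    return result


def toy_tokens(source):
    """Split into words and add one NEWLINE token per non-empty logical line."""
    tokens = []
    for line in logical_lines(source):
        words = line.split()
        if words:
            tokens.extend(words)
            tokens.append("NEWLINE")
    return tokens


# Two physical lines joined by a backslash, followed by an ordinary line.
example = "x = 1 + \\\n    2\ny = 3"
stream = toy_tokens(example)
```

In the resulting stream, no NEWLINE token appears between "+" and "2", because the first newline was consumed by the continuation.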

Whitespace – Languages that do not need continuations
  • Ada – Lines terminate with semicolon
  • C# – Lines terminate with semicolon
  • JavaScript – Lines terminate with semicolon (which may be inferred)
  • Lua
  • OCaml
Ampersand as last character of line
  • Fortran 90, Fortran 95, Fortran 2003, Fortran 2008
Backslash as last character of line
Backtick as last character of line
Hyphen as last character of line
Underscore as last character of line
Ellipsis (as three periods–not one special character)
  • MATLAB: The ellipsis token need not be the last characters on the line, but any following it will be ignored.[6] (In essence, it begins a comment that extends through (i.e. including) the first subsequent newline character. Contrast this with an inline comment, which extends until the first subsequent newline.)
Comma delimiter as last character of line
  • Ruby (comment may follow delimiter)
Left bracket delimiter as last character of line
Operator as last object of line
  • Ruby (comment may follow operator)
Operator as first character of continued line
  • AutoHotkey: Any expression operators except ++ and --, as well as a comma or a period[8]
Backslash as first character of continued line
Some form of inline comment serves as line continuation
Character position
  • Fortran 77: A non-comment line is a continuation of the previous non-comment line if any non-space character appears in column 6. Comment lines cannot be continued.
  • COBOL: String constants may be continued by not ending the original string in a PICTURE clause with ', then inserting a - in column 7 (the same position that is used for the * of a comment line).
  • TUTOR: Lines starting with a tab (after any indentation required by the context) continue the previous command.
[End and Begin] using normal quotes
  • C and C++ preprocessor: The string is ended normally and continues by starting with a quote on the next line.
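
The character-position rule described above for Fortran 77 can be sketched in Python as follows. This is a simplified illustration: the function name join_fixed_form is hypothetical, statement labels in columns 1 to 5 are ignored, and continued pieces are joined with a single space.

```python
def join_fixed_form(lines):
    """Merge Fortran 77 fixed-form lines into complete statements."""
    statements = []
    for raw in lines:
        line = raw.ljust(72)              # pad so column indexing is always safe
        if line[0] in "C*" or not line.strip():
            continue                      # comment (C or * in column 1) or blank line
        body = line[6:72].strip()         # statement text occupies columns 7-72
        if line[5] != " " and statements:
            statements[-1] += " " + body  # non-space in column 6: continuation
        else:
            statements.append(body)
    return statements


source = [
    "C     COMPUTE A SUM",
    "      TOTAL = A + B",
    "     &      + C",
    "      PRINT *, TOTAL",
]
statements = join_fixed_form(source)
```

The comment line is skipped, and the line with & in column 6 is appended to the statement before it.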

Libraries

Importing a library is a way to read in external, possibly compiled, routines, programs, or packages. Imports can be classified by level (module, package, class, procedure, ...) and by syntax (directive name, attributes, ...).

File import
  • ASP: #include file="filename"
  • AutoHotkey, AutoIt, C, C++: #include "filename", #include <filename>
  • COBOL: COPY filename.
  • Falcon: load "filename"
  • Fortran: include 'filename'
  • Lua: require("filename")
  • Mathematica and Wolfram Language: Import["filename"]
  • MATLAB: addpath(directory)[9]
  • Objective-C: #import "filename", #import <filename>
  • Perl: require "filename";
  • PHP: include "filename";, require "filename";
  • Prolog: :-include("filename").
  • Pick Basic: include [filename] program, #include [filename] program
  • R: source("filename")
  • Red: load %filename
  • Rust: include!( "filename");
Package import
  • Ada: with package
  • C, C++: #include filename
  • Cobra: use Package.Name
  • D: import package.module;, import altname = package.module;
  • Falcon: load module, load module.submodule
  • Fortran 90+: use module, use module, only : identifier
  • Go: import altname "package/name"
  • Haskell: import Module, import qualified Module as M
  • Java, MATLAB, Kotlin: import package.*
  • JavaScript: import altname from "modname";, import "modname";
  • Lua: require("modname")
  • Mathematica and Wolfram Language: <<name
  • Oberon: IMPORT module
  • Objective-C: @import module;
  • Pascal: uses unit
  • Perl: use Module;, use Module qw(import options);
  • Prolog: :-use_module(module).
  • Python: import module, from module import *
  • Rust: mod modname;, #[path = "filename"] mod altname;, extern crate libname;, extern crate libname as altname;
  • R: library("package")
  • Scala: import package._, import package
  • Swift: import module
Class import
  • Falcon: import class
  • Java, MATLAB, Kotlin: import package.class
  • JavaScript: import class from "modname";, import {class} from "modname";, import {class as altname} from "modname";
  • PHP: use Namespace\ClassName;, use Namespace\ClassName as AliasName;
  • Python: from module import class
  • Scala: import package.class, import package.{ class1 => alternativeName, class2 }, import package._
Procedure/function import
  • D: import package.module : symbol;, import package.module : altsymbolname = symbol;
  • Haskell: import Module (function)
  • JavaScript: import function from "modname";, import {function} from "modname";, import {function as altname} from "modname";
  • MATLAB: import package.function
  • Perl: use Module ('symbol');
  • PHP: use function Namespace\function_name;, use function Namespace\function_name as function_alias_name;
  • Python: from module import function
  • Rust: use module::submodule::symbol;, use module::submodule::{symbol1, symbol2};, use module::submodule::symbol as altname;
  • Scala: import package.class.function, import package.class.{ function => alternativeName, otherFunction }
Constant import
  • PHP: use const Namespace\CONST_NAME;

The above statements can also be classified by whether they are a syntactic convenience (allowing things to be referred to by a shorter name, but they can still be referred to by some fully qualified name without import), or whether they are actually required to access the code (without which it is impossible to access the code, even with fully qualified names).

Syntactic convenience
  • Java: import package.*, import package.class
  • OCaml: open module
Required to access code
  • Go: import altname "package/name"
  • JavaScript: import altname from "modname";
  • Python: import module
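
The Python forms listed in this section can be seen side by side in the short sketch below. It uses only the standard-library modules math and collections; the alias names m and PI are arbitrary choices for illustration.

```python
import math                          # module import; names need the "math." prefix
import math as m                     # the same module under an alternative name
from math import sqrt                # function import into the local namespace
from math import pi as PI            # constant import with an alias
from collections import OrderedDict  # class import

root = sqrt(16)                      # imported name used directly
same_root = math.sqrt(16)            # fully qualified use of the same function
circumference = 2 * PI * 1.0
ordered = OrderedDict(a=1)
```

In each case the name is unavailable until the corresponding import statement has run, which is why Python appears under "Required to access code".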

Blocks

A block is a notation for a group of two or more statements, expressions or other units of code that are related in such a way as to comprise a whole.

Braces (a.k.a. curly brackets) { ... }
Parentheses ( ... )
Square brackets [ ... ]
  • Smalltalk (blocks are first-class objects, a.k.a. closures)
begin ... end
do ... end
do ... done
do ... end
  • Lua, Ruby (pass blocks as arguments, for loop), Seed7 (encloses loop bodies between do and end)
X ... end (e.g. if ... end):
  • Ruby (if, while, until, def, class, module statements), OCaml (for & while loops), MATLAB (if & switch conditionals, for & while loops, try clause, package, classdef, properties, methods, events, & function blocks), Lua (then / else & function)
(begin ...)
(progn ...)
(do ...)
Indentation
Others
  • Ada, Visual Basic, Seed7: if ... end if
  • APL: :If ... :EndIf or :If ... :End
  • Bash, sh, and ksh: if ... fi, do ... done, case ... esac;
  • ALGOL 68: begin ... end, ( ... ), if ... fi, do ... od
  • Lua, Pascal, Modula-2, Seed7: repeat ... until
  • COBOL: IF ... END-IF, PERFORM ... END-PERFORM, etc. for statements; ... . for sentences.
  • Visual Basic .NET: If ... End If, For ... Next, Do ... Loop
  • Small Basic: If ... EndIf, For ... EndFor, While ... EndWhile
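
For indentation-delimited blocks there is no visible delimiter, but the lexer can still produce explicit block markers. As a small illustration, Python's standard tokenize module reports INDENT and DEDENT tokens where a brace-delimited language would have an opening and closing brace. The source string below is a made-up fragment that is only tokenized, never executed.

```python
import io
import tokenize

# A made-up fragment with one indented block; it is tokenized, not run.
source = (
    "if ready:\n"
    "    start()\n"
    "    log()\n"
    "stop()\n"
)

# Keep only the block-structure tokens that the tokenizer synthesizes.
block_tokens = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in (tokenize.INDENT, tokenize.DEDENT)
]
```

The list contains one INDENT (where the block opens) and one DEDENT (where it closes).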

Comments

Comments can be classified by:

  • style (inline/block)
  • parse rules (ignored/interpolated/stored in memory)
  • recursivity (nestable/non-nestable)
  • uses (docstrings/throwaway comments/other)

Inline comments

Inline comments are generally those that use a newline character to indicate the end of a comment, and an arbitrary delimiter or sequence of tokens to indicate the beginning of a comment.

Examples:

Symbol Languages
C Fortran I to Fortran 77 (C in column 1)
REM BASIC, Batch files
:: Batch files, COMMAND.COM, cmd.exe
NB. J; from the (historically) common abbreviation Nota bene, the Latin for "note well".
⍝ APL; the mnemonic is that the glyph (jot overstruck with shoe-down) resembles a desk lamp, and hence "illuminates" the foregoing.
# Bourne shell and other UNIX shells, Cobra, Perl, Python, Ruby, Seed7, Windows PowerShell, PHP, R, Make, Maple, Elixir, Nim[10]
% TeX, Prolog, MATLAB,[11] Erlang, S-Lang, Visual Prolog
// ActionScript, C (C99), C++, C#, D, F#, Go, Java, JavaScript, Kotlin, Object Pascal (Delphi), Objective-C, PHP, Rust, Scala, SASS, Swift, Xojo
' Monkey, Visual Basic, VBScript, Small Basic, Gambas, Xojo
! Fortran, Basic Plus, Inform, Pick Basic
; Assembly x86, AutoHotkey, AutoIt, Lisp, Common Lisp, Clojure, Rebol, Red, Scheme
-- Euphoria, Haskell, SQL, Ada, AppleScript, Eiffel, Lua, VHDL, SGML, PureScript
* Assembler S/360 (* in column 1), COBOL I to COBOL 85, PAW, Fortran IV to Fortran 77 (* in column 1), Pick Basic
|| Curl
" Vimscript, ABAP
\ Forth
*> COBOL 2002
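
The definition above (a start delimiter, with the newline as the end) can be sketched as a one-line Python helper. The function name strip_inline_comment is hypothetical, and the sketch is deliberately naive: it assumes the delimiter never occurs inside a string literal, which a real lexer would have to check.

```python
def strip_inline_comment(line, delimiter):
    """Drop everything from the first delimiter to the end of the line.

    Naive on purpose: a delimiter inside a string literal is not recognized.
    """
    head, _, _ = line.partition(delimiter)
    return head.rstrip()


python_style = strip_inline_comment("x = 1  # set x", "#")
sql_style = strip_inline_comment("SELECT 1 -- constant", "--")
c_style = strip_inline_comment("int n = 0; // counter", "//")
no_comment = strip_inline_comment("y = 2", "#")
```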

Block comments

Block comments are generally those that use a delimiter to indicate the beginning of a comment, and another delimiter to indicate the end of a comment. In this context, whitespace and newline characters are not counted as delimiters.

Examples:

Symbol Languages
comment ~ ; ALGOL 60, SIMULA
¢ ~ ¢, # ~ #, co ~ co, comment ~ comment ALGOL 68[12][13]
/* ~ */ ActionScript, AutoHotkey, C, C++, C#, D,[14] Go, Java, JavaScript, Kotlin, Objective-C, PHP, PL/I, Prolog, Rexx, Rust (can be nested), Scala (can be nested), SAS, SASS, SQL, Swift (can be nested), Visual Prolog, CSS
#cs ~ #ce AutoIt[15]
/+ ~ +/ D (can be nested)[14]
/# ~ #/ Cobra (can be nested)
<# ~ #> PowerShell
<!-- ~ --> HTML, XML
=begin ~ =cut Perl
#`( ~ ) Raku (bracketing characters can be (), <>, {}, [], any Unicode characters with BiDi mirrorings, or Unicode characters with Ps/Pe/Pi/Pf properties)
=begin ~ =end Ruby
#<TAG> ~ #</TAG>, #stop ~ EOF, #iffalse ~ #endif, #ifntrue ~ #endif, #if false ~ #endif, #if !true ~ #endif S-Lang[16]
{- ~ -} Haskell (can be nested)
(* ~ *) Delphi, ML, Mathematica, Object Pascal, Pascal, Seed7, AppleScript, OCaml (can be nested), Standard ML (can be nested), Maple, Newspeak, F#
{ ~ } Delphi, Object Pascal, Pascal, Red
{# ~ #} Nunjucks, Twig
{{! ~ }} Mustache, Handlebars
{{!-- ~ --}} Handlebars (cannot be nested, but may contain {{ and }})
|# ~ #| Curl
%{ ~ %} MATLAB[11] (the symbols must be in a separate line)
#| ~ |# Lisp, Scheme, Racket (can be nested in all three).
#[ ~ ]# Nim[17]
--[[ ~ ]], --[=[ ~ ]=], --[=...=[ ~ ]=...=] Lua (brackets can have any number of matching = characters; can be nested within non-matching delimiters)
" ~ " Smalltalk
(comment ~ ) Clojure
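
The difference between nestable and non-nestable block comments in the table above can be sketched in Python. The function name strip_block_comments and its parameters are hypothetical. The sketch assumes the opening and closing delimiters differ (so it does not cover forms such as " ~ ") and it ignores string literals.

```python
def strip_block_comments(text, opener="/*", closer="*/", nestable=True):
    """Remove block comments, tracking depth when the language nests them."""
    out = []
    depth = 0
    i = 0
    while i < len(text):
        if text.startswith(opener, i) and (depth == 0 or nestable):
            depth += 1                    # entering a (possibly nested) comment
            i += len(opener)
        elif depth > 0 and text.startswith(closer, i):
            depth -= 1                    # leaving one level of comment
            i += len(closer)
        else:
            if depth == 0:
                out.append(text[i])       # ordinary code is kept
            i += 1
    return "".join(out)


sample = "a /* outer /* inner */ still outer */ b"
nested = strip_block_comments(sample, nestable=True)    # Rust, Swift, or Scala style
flat = strip_block_comments(sample, nestable=False)     # C or Java style
```

With nesting, the whole span is removed. Without it, the comment ends at the first closing delimiter and the remainder is treated as code.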

Unique variants

Fortran
  • Indenting lines in Fortran 66/77 is significant. The actual statement is in columns 7 through 72 of a line. Any non-space character in column 6 indicates that this line is a continuation of the previous line. A 'C' in column 1 indicates that this entire line is a comment. Columns 1 through 5 may contain a number which serves as a label. Columns 73 through 80 are ignored and may be used for comments; in the days of punched cards, these columns often contained a sequence number so that the deck of cards could be sorted into the correct order if someone accidentally dropped the cards. Fortran 90 removed the need for the indentation rule and added inline comments, using the ! character as the comment delimiter.
COBOL
  • In fixed format code, line indentation is significant. Columns 1–6 and columns from 73 onwards are ignored. If a * or / is in column 7, then that line is a comment. Until COBOL 2002, if a D or d was in column 7, it would define a "debugging line" which would be ignored unless the compiler was instructed to compile it.
Cobra
  • Cobra supports block comments with "/# ... #/" which is like the "/* ... */" often found in C-based languages, but with two differences. The # character is reused from the single-line comment form "# ...", and the block comments can be nested which is convenient for commenting out large blocks of code.
Curl
  • Curl supports block comments with user-defined tags as in |foo# ... #foo|.
Lua
  • Like raw strings, there can be any number of equals signs between the square brackets, provided both the opening and closing tags have a matching number of equals signs; this allows nesting as long as nested block comments/raw strings use a different number of equals signs than their enclosing comment: --[[comment --[=[ nested comment ]=] ]]. Lua discards the first newline (if present) that directly follows the opening tag.
Perl
  • Block comments in Perl are considered part of the documentation, and are given the name Plain Old Documentation (POD). Technically, Perl does not have a convention for including block comments in source code, but POD is routinely used as a workaround.
PHP
  • PHP supports standard C/C++ style comments, but supports Perl style as well.
Python
  • Using triple quotes to comment out lines of source does not actually form a comment.[18] The enclosed text becomes a string literal, which Python usually ignores (except when it is the first statement in the body of a module, class or function; see docstring).
Raku
  • Raku uses #`(...) to denote block comments.[19] Raku actually allows the use of any "right" and "left" paired brackets after #` (i.e. #`(...), #`[...], #`{...}, #`<...>, and even the more complicated #`{{...}} are all valid block comments). Brackets are also allowed to be nested inside comments (i.e. #`{ a { b } c } goes to the last closing brace).
Ruby
  • A block comment in Ruby opens at a =begin line and closes at an =end line.
S-Lang
  • The region of lines enclosed by the #<tag> and #</tag> delimiters is ignored by the interpreter. The tag name can be any sequence of alphanumeric characters that may be used to indicate how the enclosed block is to be deciphered. For example, #<latex> could indicate the start of a block of LaTeX formatted documentation.
Scheme and Racket
  • The next complete syntactic component (s-expression) can be commented out with #; .
ABAP

ABAP supports two different kinds of comments. If the first character of a line, including indentation, is an asterisk (*), the whole line is considered a comment, while a single double quote (") begins an inline comment which runs until the end of the line. ABAP comments are not possible between the statements EXEC SQL and ENDEXEC because Native SQL has other usages for these characters. In most SQL dialects the double dash (--) can be used instead.
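
The Python entry in the list above can be checked directly. In the following sketch (the function names are hypothetical), a triple-quoted string that is the first statement of a function body is stored as the docstring, while the same kind of string later in the body is evaluated and discarded.

```python
def documented():
    """First statement in the body: stored as the docstring."""
    return 1


def not_documented():
    x = 1
    """A later bare string: evaluated and discarded, not a docstring."""
    return x


first_doc = documented.__doc__        # the string literal above
second_doc = not_documented.__doc__   # None: no docstring was recorded
```

The values assume a normal interpreter run; with the -OO option Python strips docstrings, so first_doc would also be None.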

Esoteric languages

Comment comparison

There is a wide variety of syntax styles for declaring comments in source code. BlockComment in italics is used here to indicate block comment style. InlineComment in italics is used here to indicate inline comment style.

Language In-line comment Block comment
Ada, Eiffel, Euphoria, Occam, SPARK, ANSI SQL, and VHDL -- InlineComment
ALGOL 60 comment BlockComment;
ALGOL 68 ¢ BlockComment ¢

comment BlockComment comment
co BlockComment co
# BlockComment #
£ BlockComment £

APL ⍝ InlineComment
AppleScript -- InlineComment (* BlockComment *)
Assembly language (varies) ; InlineComment   one example (most assembly languages use line comments only)
AutoHotkey ; InlineComment /* BlockComment */
AWK, Bash, Bourne shell, C shell, Maple, R, Tcl, and Windows PowerShell # InlineComment <# BlockComment #>
BASIC (various dialects): 'InlineComment (not all dialects)

REM InlineComment

C (K&R, ANSI/C89/C90), CHILL, PL/I, and REXX /* BlockComment */
C (C99), C++, Go, Swift and JavaScript // InlineComment /* BlockComment */
C# // InlineComment
/// InlineComment (XML documentation comment)
/* BlockComment */
/** BlockComment */ (XML documentation comment)
COBOL I to COBOL 85 * InlineComment (* in column 7)
COBOL 2002 *> InlineComment
Curl || InlineComment |# BlockComment #|

|foo# BlockComment #foo|

Cobra # InlineComment /# BlockComment #/ (nestable)
D // InlineComment
/// Documentation InlineComment (ddoc comments)
/* BlockComment */
/** Documentation BlockComment */ (ddoc comments)

/+ BlockComment +/ (nestable)
/++ Documentation BlockComment +/ (nestable, ddoc comments)

DCL $! InlineComment
ECMAScript (JavaScript, ActionScript, etc.) // InlineComment /* BlockComment */
Forth \ InlineComment ( BlockComment ) (single line as well as multiline)

( before -- after ) stack comment convention

FORTRAN I to FORTRAN 77 C InlineComment (C in column 1)
Fortran 90 ! InlineComment
Haskell -- InlineComment {- BlockComment -}
Java // InlineComment /* BlockComment */

/** BlockComment */ (Javadoc documentation comment)

Lisp and Scheme ; InlineComment #| BlockComment |#
Lua -- InlineComment --[==[ BlockComment]==] (variable number of = signs)
Maple # InlineComment (* BlockComment *)
Mathematica (* BlockComment *)
MATLAB % InlineComment %{
BlockComment (nestable)
%}

Note: Both percent–bracket symbols must be the only non-whitespace characters on their respective lines.
Nim # InlineComment #[ BlockComment ]#
Object Pascal (Delphi) // InlineComment (* BlockComment *)
{ BlockComment }
OCaml (* BlockComment (* nestable *) *)
Pascal, Modula-2, Modula-3, Oberon, and ML: (* BlockComment *)
Perl and Ruby # InlineComment =begin
BlockComment
=cut
(=end in Ruby) (POD documentation comment)

__END__
Comments after end of code

PHP # InlineComment
// InlineComment
/* BlockComment */
/** Documentation BlockComment */ (PHP Doc comments)
PILOT R:InlineComment
PLZ/SYS ! BlockComment !
PL/SQL and TSQL -- InlineComment /* BlockComment */
Prolog % InlineComment /* BlockComment */
Python # InlineComment ''' BlockComment '''
""" BlockComment """

(Documentation string when first line of module, class, method, or function)

Raku # InlineComment #`{
BlockComment
}

=comment
    This comment paragraph goes until the next POD directive
    or the first blank line.
[20][21]

Red ; InlineComment { BlockComment }
Rust // InlineComment

/// InlineComment ("Outer" rustdoc comment)
//! InlineComment ("Inner" rustdoc comment)

/* BlockComment */ (nestable)

/** BlockComment */ ("Outer" rustdoc comment)
/*! BlockComment */ ("Inner" rustdoc comment)

SAS * BlockComment;
/* BlockComment */
Seed7 # InlineComment (* BlockComment *)
Simula comment BlockComment;
! BlockComment;
Smalltalk "BlockComment"
Smarty {* BlockComment *}
Standard ML (* BlockComment *)
TeX, LaTeX, PostScript, Erlang, Elixir and S-Lang % InlineComment
Texinfo @c InlineComment

@comment InlineComment

TUTOR * InlineComment
command $ InlineComment
Visual Basic ' InlineComment
Rem InlineComment
Visual Basic .NET ' InlineComment

''' InlineComment (XML documentation comment)
Rem InlineComment

Visual Prolog % InlineComment /* BlockComment */
Wolfram Language (* BlockComment *)
Xojo ' InlineComment
// InlineComment
rem InlineComment

See also

References

  1. For multiple statements on one line
  2. Three different kinds of clauses, each separates phrases and the units differently:
      1. serial-clause using go-on-token (viz. semicolon): begin a; b; c end – units are executed in order.
      2. collateral-clause using and-also-token (viz. ","): begin a, b, c end – order of execution is to be optimised by the compiler.
      3. parallel-clause using and-also-token (viz. ","): par begin a, b, c end – units must be run in parallel threads.
  3. semicolon – result of preceding statement hidden, comma – result displayed
  4. Bash Reference Manual, 3.1.2.1 Escape Character
  5. Python Documentation, 2. Lexical analysis: 2.1.5. Explicit line joining
  6. Mathworks.com Archived 7 February 2010 at the Wayback Machine
  7. https://ss64.com/nt/syntax-brackets.html
  8. https://autohotkey.com/docs/Scripts.htm#continuation
  9. For an M-file (MATLAB source) to be accessible by name, its parent directory must be in the search path (or current directory).
  10. https://nim-lang.org/docs/manual.html#lexical-analysis-comments
  11. "Mathworks.com". Archived from the original on 21 November 2013. Retrieved 25 June 2013.
  12. "Algol68_revised_report-AB.pdf on PDF pp. 61–62, original document pp. 121–122" (PDF). Retrieved 27 May 2014.
  13. "HTML Version of the Algol68 Revised Report AB". Archived from the original on 17 March 2013. Retrieved 27 May 2014.
  14. "DLang.org, Lexical". Retrieved 27 May 2014.
  15. "AutoItScript.com Keyword Reference, #comments-start". Retrieved 27 May 2014.
  16. "slang-2.2.4/src/slprepr.c – line 43 to 113". Retrieved 28 May 2014.
  17. "Nim Manual".
  18. "Python tip: You can use multi-line strings as multi-line comments", 11 September 2011, Guido van Rossum
  19. "Perl 6 Documentation (Syntax)". docs.perl6.org. Comments. Retrieved 5 April 2017.
  20. "Perl 6 POD Comments".
  21. "Perl 6 POD (Abbreviated Blocks)".
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.