Shortest undefined behavior sample in C++

8

2

What's the shortest, well-formed C++ code that exhibits undefined behavior?

Luchian Grigore

Posted 2012-08-22T14:49:59.293

Reputation: 197

Question was closed 2016-05-16T06:42:03.123

1

What do you mean "runnable"? If it has UB, there's no guarantee it can be run. – R. Martinho Fernandes – 2012-08-22T14:59:20.447

@R.MartinhoFernandes well, I mean that it starts. – Luchian Grigore – 2012-08-22T15:00:06.923

@LuchianGrigore that is very dependent on the compiler (version). – rubenvb – 2012-08-22T15:04:00.673

1

Define "exhibits". Does it need to show something? Or is it enough that the internal memory state is undefined at some point during the program? – Mr Lister – 2012-08-22T17:28:13.510

Answers

12

int main(){main;}

3.6.1 Main function [basic.start.main]

3 - [...] The function main shall not be used within a program.

Edit: this is diagnosable, so not UB.


int main(){for(;;);}

1.10 Multi-threaded executions and data races [intro.multithread]

24 - The implementation may assume that any thread will eventually do one of the following: — terminate, — make a call to a library I/O function, — access or modify a volatile object, or — perform a synchronization operation or an atomic operation.
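For contrast, here is a minimal sketch of a spin loop that stays within the quoted rule, assuming C++11 or later. The names `stop_requested` and `wait_for_stop` are hypothetical and not from the answer. The only point is that each iteration performs an atomic operation, which is one of the four listed actions.

```cpp
#include <atomic>

// Hypothetical flag, introduced only for this illustration.
std::atomic<bool> stop_requested{false};

// Spins until the flag has been set. Every iteration performs an atomic
// load, one of the actions listed in the quoted paragraph, unlike the
// empty for(;;); loop. The counter is unsigned so that it wraps around
// instead of overflowing if the loop runs for a very long time.
unsigned wait_for_stop() {
    unsigned spins = 0;
    while (!stop_requested.load()) {
        ++spins;
    }
    return spins;
}
```

If the flag is already set when `wait_for_stop()` is called, the loop body never runs and the function returns 0.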


int main(){int i=i;}

4.1 Lvalue-to-rvalue conversion [conv.lval]

1 - [...] If the object to which the glvalue refers is [...] uninitialized, a program that necessitates this conversion has undefined behavior.
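As a sketch of the well-defined counterpart, the variable can be value-initialized before it is read. The function name `read_initialized` is hypothetical.

```cpp
// Hypothetical helper, for contrast with `int i=i;`.
int read_initialized() {
    int i{};   // value-initialization: i is zero before it is read
    return i;  // lvalue-to-rvalue conversion on an initialized object
}
```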


//^L.

Here ^L is the form feed character, which is part of the basic character set. 4 characters (a newline is not required per 2.2:2). Undefined behaviour is per

2.8 Comments [lex.comment]

1 - If there is a form-feed or a vertical-tab character in [a //-style] comment, only white-space characters shall appear between it and the new-line that terminates the comment; no diagnostic is required.

ecatmur

Posted 2012-08-22T14:49:59.293

Reputation: 1 675

the second one's not undefined. It can optimize out the for loop without issue. That is not undefined behavior. – rubenvb – 2012-08-22T15:19:30.960

3

@rubenvb http://stackoverflow.com/questions/3592557/optimizing-away-a-while1-in-c0x - anytime the standard says "the compiler may assume P," it is implied that a program which has the property not-P has undefined semantics.

– ecatmur – 2012-08-22T15:28:43.740

yet the compiler may also assume no side effects when eliding copy constructors. – rubenvb – 2012-08-22T15:33:01.627

@rubenvb because that particular case is explicitly mentioned. – R. Martinho Fernandes – 2012-08-22T15:36:10.243

@rubenvb 12.8:31 says an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects. "assume" is not in the language. – ecatmur – 2012-08-22T15:40:58.993

I believe the rule at §3.6.1 is diagnosable, so your first is ill-formed. – Jerry Coffin – 2012-08-22T15:43:17.830

@Jerry is right. §3.6.1 has no mention that no diagnostic is required or that the behaviour is undefined. That makes it indeed diagnosable. – R. Martinho Fernandes – 2012-08-22T15:46:07.777

the for(;;) is exactly the loop I use in my embedded project..so not at all undefined, but rather strong, optimized and very useful – Bogdan Alexandru – 2012-10-12T07:24:55.303

1

@BogdanAlexandru your embedded project has undefined behaviour. That means that it might work as you expect now, but when you upgrade your compiler it will behave differently. – ecatmur – 2012-10-12T08:17:20.940

5

\u\
0000

This has eight characters, and has undefined behaviour, according to §2.2/1.

Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. If, as a result, a character sequence that matches the syntax of a universal-character-name is produced, the behavior is undefined.

R. Martinho Fernandes

Posted 2012-08-22T14:49:59.293

Reputation: 2 135

1

@JerryCoffin He is using a freestanding environment where main is not required. You conveniently left out the rest of §3.6.1/1 which says "It is implementation-defined whether a program in a freestanding environment is required to define a main function.". This is a valid program ("valid" as in it's a "program" according to your argument). – David – 2015-01-09T10:04:30.133

Dang, just came here to bring that one. +1 – Columbo – 2015-04-24T22:27:54.080

this isn't a "program". A program has a main. – rubenvb – 2012-08-22T15:18:41.570

@rubenvb Try it on Hell++. It has undefined behaviour, so can be a program without a main. It's hard to enforce rules when you ask for a program that doesn't have any rules to follow (that's what UB means!). – R. Martinho Fernandes – 2012-08-22T15:19:01.373

3

Based on the wording of the question, I think this qualifies -- it asks only for the shortest "code" that displays UB. Based on the "runnable" in the comment, I think the intent, however, was the shortest program, which would rule out your code, because according to §3.6.1/1: "A program shall contain a global function called main, which is the designated start of the program." As such, without main, what you have is ill-formed. – Jerry Coffin – 2012-08-22T16:13:26.187

2

#include. /*Imagine a new-line right after the dot*/

§16.2/4:

A preprocessing directive of the form

             #include  pp-tokens   new-line

(that does not match one of the two previous forms) is permitted. [..] If the directive resulting after all replacements does not match one of the two previous forms, the behavior is undefined.

Columbo

Posted 2012-08-22T14:49:59.293

Reputation: 121

Won't it parse the same if you remove the space before the dot? – feersum – 2015-06-05T21:27:16.043

@feersum Yep. That doesn't alter the idea, which is why I avoided it to enhance readability, but I guess it serves the point of the question to illustrate that the space can be omitted. Thanks! – Columbo – 2015-06-05T21:54:30.643

2

int main(){int i=1>>-1;}

Explanation:

C++98 and C++11 §5.8/1 both state that

The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
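The quoted condition can be checked before shifting. The following is a minimal sketch, not part of the answer; `checked_shift_right` is a hypothetical helper, and it assumes the left operand is an `int`, so no promotion changes its width.

```cpp
#include <limits>

// Hypothetical helper: performs value >> count only when the shift count
// is non-negative and smaller than the width of int in bits.
// Returns false, leaving result untouched, when the shift would be undefined.
bool checked_shift_right(int value, int count, int& result) {
    // digits excludes the sign bit, so add 1 to get the full width.
    const int width = std::numeric_limits<int>::digits + 1;
    if (count < 0 || count >= width) {
        return false;
    }
    result = value >> count;
    return true;
}
```

With this helper, `checked_shift_right(1, -1, r)` reports failure instead of evaluating `1>>-1`.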

rubenvb

Posted 2012-08-22T14:49:59.293

Reputation: 224

3

on a 32-bit system, isn't the length in bits of int typically 32? changing the RHS of the shift to -1 is the same number of chars and won't depend on integer sizes – ardnew – 2012-08-22T19:42:41.397

1

If one is to believe Wikipedia, here are a few:

Modifying strings is said to cause undefined behavior. It's always worked for me.

int main(int c,char*v){v[0]='.';}

A non-void function with no return causes undefined return values.

int a(){}
int main(){return a();}

Division (of int?) by zero is supposedly undefined. All I know is that it crashes.

int main(int c){c/0;}
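For the integer division case, a guard before dividing avoids the undefined behavior. This is a sketch under assumptions: `checked_divide` is a hypothetical name, and the second check (the minimum `int` divided by -1, whose quotient is not representable) is an addition that the answer does not discuss.

```cpp
#include <limits>

// Hypothetical helper: divides only when the result is defined.
// Returns false, leaving quotient untouched, otherwise.
bool checked_divide(int numerator, int denominator, int& quotient) {
    if (denominator == 0) {
        return false;  // division by zero is undefined
    }
    if (numerator == std::numeric_limits<int>::min() && denominator == -1) {
        return false;  // quotient would not fit in int (added assumption)
    }
    quotient = numerator / denominator;
    return true;
}
```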

shiona

Posted 2012-08-22T14:49:59.293

Reputation: 2 889

These are not legal code snippets. Main must be defined as int main(){} at the very least. – rubenvb – 2012-08-22T15:14:41.233

Damn, I was sure the question was about C. Sorry about that. – shiona – 2012-08-22T18:13:55.243

1

@shiona the second parameter should be char ** not char *. You're merely modifying part of a pointer value. In fact, argv is guaranteed to be safely modifiable. – oldrinb – 2012-09-09T15:12:35.223

The standard allows string literals to be read-only, so many implementations do that. It's correct for char *argv[] to not have any const qualifiers. You could crash by writing into a string literal with int main(){char *v="";*v=1;} – Peter Cordes – 2016-05-15T03:53:53.133
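A sketch of the well-defined alternative to writing into a string literal: copy the literal into a local array and modify the copy. The function name `first_char_after_edit` and the literal "abc" are hypothetical choices for illustration.

```cpp
// Hypothetical helper: the array is initialized from the literal and has
// its own writable storage, so modifying it does not touch the literal.
char first_char_after_edit() {
    char text[] = "abc";  // array copy of the literal, including the '\0'
    text[0] = '.';        // modifies the copy only
    return text[0];
}
```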

0

int main() { int* a; return a[0]; }

danbo

Posted 2012-08-22T14:49:59.293

Reputation: 11