C++: Why should I breakup my expressions?

One Line Expressions

There has been and I'm sure there still are many talks about whether to write it all on one line or break up expressions on many lines.

One Line Pros

The only pro I can think of for having everything on the same line is conciseness. For example:

*dst++ = *src++;

This is a very clear statement for most every C/C++ programmer.

I have to agree with this fact and I at times use such lines of code too...

One Line Cons

Now, I write very large libraries and programs (thousands when not ten of thousands or even hundreds of thousand lines of code....) and it can be tedious to break down such lines of code in many expressions.

If you'd like to see one of these programs, I have Snap! C++ which is some 500,000 lines of mainly my code (the other 500,000 is from STL and other libraries. See Snap! C++ Coverity for a count.)

The line above can be broken up into four lines:

int const v = *src;
*dst = v;
++src;
++dst;

Why is that generally better?

There are three main reasons why this code gives you a much easier way to debug your code.

Invalid Pointers

Whenever a pointer is invalid, your software crashes with a SEGV or BUS error.

If you have a single line like in our first example, you will have no idea whether the input or output pointer was wrong.

When you break up the expressions, however, in debug mode it will tell you whether it was while reading the source or writing the destination because the line numbers are different.

This is a very simple example, but trust me, I've seen expressions with over 10 pointers used in unison. When a crash occurs, go figure which of those 10 pointers caused the error...

Note that many will tell me that they can very quickly see which pointer is wrong in their debugger. True. Only my clients do not run my code in my debugger... yet I can have a crash report that tells me the error occurred on line 5. I look at line 5 and I have a single pointer. I know which pointer is wrong. Bing!

Speed Optimization

Something many programmers tend to forget in C++ is that the pre- and post-increment and -decrement do not work the same way.

The pre-increment directly increments your variable and then it returns that new value.

The post-increment first makes a copy, then it increments your variable, and finally, it returns the copy.

With good optimization, that extra copy does not happen when you are handling bare pointers (like in C). However, when src or dst are complex C++ objects, you bet that this copy is going to happen every single time.

Also, many programmers introduced bugs in their code by trying to increment and/or decrement their pointer within an expression. It's like calling a function with side effects within an expression. It's just not a good idea (actually, the Ada language prevents you from doing so. There must be good reasons for that, hey?)

So as a result, having the increment/decrement happen as a separate statement is an excellent idea. If this helps you, this is similar to writing:

a += 1;
   or
a -= 1;

Here we clearly see an assignment. You should always treat assignments as statements. Another one great feature of C/C++ which should not be used within expressions.

Something like this is just not a good idea.

b = (a += 3) * 7;

Compilers today can do all the necessary optimizations so you do not need to try to do that for them. Actually, it is very likely that the compiler will do a much better job than you. So write this instead:

a += 3;
b = a * 7;

It makes it a lot clearer to the reader (the poor guy who has to debug your code.)

Was it Tested?

Another reason for having separate lines of code is to make sure that everything gets tested by your unit tests.

Many expressions make use of the ?: operator. This means that only one part of the expression is going to be evaluated. The other part will be ignored.

If you write everything on a single line like so:

return s < 0 ? v >> -s : v << s;

Testing with s >= 0 or s < 0 will mark the entire line as covered when in truth, both values have to be used to test that line 100%.

In most cases, your coverage test will correctly show you your results if you write:

return s < 0
        ? v >> -s
        : v << s;

Now that this is three distinct lines gives the coverage feature of g++ a way to show whether the v >> -s or the v << s was used. Your coverage will be 100% complete only if both lines get it.

If you are not very heavy on Coverage testing like I'm you probably will laugh at this one. But if you are working on the code for a rocket or a self-driven car, I hope you'll take it into consideration.

Another optimization that people do:

if(condition) return;

So if a condition is true, return immediately. I see two problems there.

First, you should always write your list of statement between curly brackets. Why? Because when you (or someone else) adds another statement, there are going to be required and you may miss adding them at that point. For example, the following will probably not do what you first think:

if(condition) a = 5; return 7;

You could rewrite it like this so it works as expected:

if(condition) return a = 5, 7;

but yet again, it's not a good idea. It's really ugly code and long term not safe at all. Write it like this:

if(condition)
{
    a = 5;
    return 7;
}

and everyone will get it.

And of course, in term of coverage, with the one-liner, you again can't know whether the condition was ever true. The coverage test will show you the one-liner as executed even if only the condition was tested and was false.

With those 3 lines of code, the coverage test will tell you exactly whether all three were covered.