Using this Standard. If you want to make a local copy
of this standard and use it as your own, you are perfectly free
to do so. That's why I made it! If you find any errors or make any
improvements please email me the changes so I can merge them in.
I also have a programming blog at
http://radio.weblogs.com/0103955/categories/stupidHumanProgramming/
that is accidentally interesting at times, as is my wiki at
http://www.possibility.com/epowiki/Wiki.jsp.
It helps if the standard annoys everyone in some way so everyone
feels they are on the same playing field. The proposal here
has evolved over many projects, many companies, and literally
a total of many weeks spent arguing. It is no particular person's style
and is certainly open to local amendments.
Good Points
When a project tries to adhere to common standards a few
good things happen:
Programmers can go into any code and figure out what's going on.
New people can get up to speed quickly.
People new to C++ are spared the need to develop a personal
style and defend it to the death.
People new to C++ are spared making the same mistakes over
and over again.
People make fewer mistakes in consistent environments.
Programmers have a common enemy :-)
Bad Points
Now the bad:
The standard is usually stupid because it was made by someone
who doesn't understand C++.
The standard is usually stupid because it's not what I do.
Standards reduce creativity.
Standards are unnecessary as long as people are consistent.
Standards enforce too much structure.
People ignore standards anyway.
Standards can be used as a reason for NIH (not invented here)
because the new/borrowed code won't follow the standard.
Discussion
The experience of many projects leads to the conclusion that using
coding standards makes the project go smoother. Are standards
necessary for success? Of course not. But they help, and we need all the
help we can get! Be honest, most arguments against a particular
standard come from the ego. Few decisions in a reasonable standard really
can be said to be technically deficient, just matters of taste.
So be flexible, control the ego a bit, and remember any project
is fundamentally a team effort.
Standards Enforcement
First, any serious concerns about the standard should be brought
up and worked out within the group. Maybe the standard is not
quite appropriate for your situation. It may have overlooked
important issues or maybe someone in power vehemently
disagrees with certain issues :-)
In any case, once finalized hopefully people will play the adult and
understand that this standard is reasonable, and has been found reasonable
by many other programmers, and therefore is worthy of being followed
even with personal reservations.
Failing willing cooperation it can be made a requirement that
this standard must be followed to pass a code inspection.
Failing that the only solution is a massive tickling party on the
offending party.
Accepting an Idea
It's impossible.
Maybe it's possible, but it's weak and uninteresting.
It is true and I told you so.
I thought of it first.
How could it be otherwise.
If you come to objects with a negative preconception please
keep an open mind. You may still conclude objects are bunk,
but there's a road you must follow to accept something different.
Allow yourself to travel it for a while.
6 Phases of a Project
Enthusiasm
Disillusionment
Panic
A Search for the Guilty
The Punishment of the Innocent
Praise and Honor for the Non-Participants
Flow Chart for Project Decision Making
+---------+
| START |
+---------+
|
V
YES +------------+ NO
+---------------| DOES THE |---------------+
| | DAMN THING | |
V | WORK? | V
+------------+ +------------+ +--------------+ NO
| DON'T FUCK | | DID YOU FUCK |-----+
| WITH IT | | WITH IT? | |
+------------+ +--------------+ |
| | |
| | YES |
| V |
| +------+ +-------------+ +---------------+ |
| | HIDE | NO | DOES ANYONE |<------| YOU DUMBSHIT! | |
| | IT |<----| KNOW? | +---------------+ |
| +------+ +-------------+ |
| | | |
| | V |
| | +-------------+ +-------------+ |
| | | YOU POOR | YES | WILL YOU | |
| | | BASTARD |<------| CATCH HELL? |<-----+
| | +-------------+ +-------------+
| | | |
| | | | NO
| | V V
| V +-------------+ +------------+
+-------------->| STOP |<------| SHITCAN IT |
+-------------+ +------------+
pointers should be prepended by a 'p' in most cases
place the * close to the pointer type not the variable name
Justification
The idea is that the difference between a pointer, object, and a
reference to an object is important for understanding the code,
especially in C++ where -> can be overloaded, and casting
and copy semantics are important.
Pointers really are a change of type so the * belongs near the type.
One reservation with this policy relates to declaring multiple variables
with the same type on the same line. In C++ the pointer modifier only applies
to the closest variable, not all of them, which can be very confusing,
especially for newbies. You want to have one declaration per line anyway
so you can document each variable.
Example
String* pName = new String;
String* pName, name, address; // note, only pName is a pointer.
The problem with the .C extension is that it is indistinguishable from
the .c extensions in operating systems that aren't case sensitive. Yes,
this is a UNIX vs. windows issue. Since it is a simple step aiding portability
we won't use the .C extension. The .cpp extension is a little wordy.
So the .cc extension wins by default.
Comments should document decisions. At every point
where you had a choice of what to do place a comment
describing which choice you made and why. Archeologists
will find this the most useful information.
Use Extractable Headers
Use a document extraction system like Doxygen
when documenting your code.
These headers are structured in such a way that they
can be parsed and extracted. They are not useless
like normal headers. So take time to fill them out.
If you do it right once no more documentation may be
necessary.
As part of your nightly build system have a step that generates the documentation from the
source. Then index the source using a tool like Lucene. Have a front end to the search
so developers can do full text searches on nightly builds and for release builds.
This is a wonderfully useful feature.
The next step in automation is to front the repository with a web server so that documentation
can directly refer to a source file with a URL.
Comment All Questions a Programmer May Have When Looking at Your Code
At every point in your code think about what questions a programmer may have about
the code. It's crucial you answer all those questions somehow, someway. If you
don't, as the code writer, answer those questions, who will?
If you think your code is so clear and wonderful that nobody will have any questions
then you are lying to yourself. I have never seen a large system with this wonderful
self-documenting code feature. I've seen very few small libraries, or even single
classes, that are so wonderfully self-documented.
You have a lot of tools at your disposal to answer questions:
A brain to think up the questions you should be answering. Why? Who? When? How?
What?
Variable names.
Class names.
Class decomposition.
Method decomposition.
File names.
Documentation at all levels: package, class, method, attribute, inline.
The better you are at orchestrating all these elements together the clearer your
code will be to everyone else.
I don't really consider unit tests a question answering device because if you can't
understand the code by reading it, reading something else about the code you don't
understand won't help you understand it better.
Make Your Code Discoverable by Browsing
Programmers should be able to navigate your code by looking at markers in the code,
namely the names and the directory structure. Nothing is more frustrating than
having to look at a pile of code with no idea what its organizing principles are.
Have a logical directory structure. Have directories called doc, lib, src, bin, test, pkg,
install, and whatever else, so I at least have some idea where stuff is. People use the weirdest
names and lump everything together so that it can't be untangled. Clear thought is evidenced
from the beginning by a directory structure.
Don't put more than one class in a file. Otherwise, how will I know it's there when I browse your
code? Should I really need to use search to find every last thing? Can't I just poke around the
code and find it? I can if you organize your code.
Name your files after your classes. I didn't believe this one until I saw it. Why would you name
a file differently than the class? How am I possibly supposed to know what's in the file otherwise?
Write Comments as You Code
You won't ever go back later and document your code. You just won't. Don't lie to yourself, the world,
and your mother by saying that you will.
So when you do something document it right then and there. When you create a class- document it. When
you create a method- document it. And so on. That way when you finish coding you will also be finished
documenting.
I advocate simultaneously writing code, writing tests, and writing documentation.
Which comes first depends on you and the problem. I don't think there is any rule that says
which should come first. On the path to getting stuff done I'll take the entrance that seems
easiest to me at the time. Once on the path it's easy to follow the entire trail.
Won't this break the flow? No, I think it improves flow because it keeps you mindful of what
you are doing, why you are doing it, and how it fits in the big picture. My take on TDD (test driven
development) is that it's not the tests that are really important, it's that the tests keep you
mindful while programming. A test means you are keeping in your mind, all at once, everything
you need to remember to successfully code something up. As you can't keep large chunks in your mind
then smaller chunks are better. Writing a test forces you to remember what your code is supposed
to accomplish. It's forcing you to also think about the use case/story/intent behind why
you are writing the code.
The result is a pointed mind that has focussed all its powers on doing one thing. When you can
bring that focus to your programming you can be successful. The tests are really secondary. If
your system/acceptance tests can't find bugs you are screwed anyway. And I find code written
mindfully, one step at a time, has very few bugs.
Unit tests are just one definition of a "step." You can use the original story you are implementing
as a step as well. I use unit tests more as a mental focussing device while developing, like Zen
Archery, than for the actual tests. After development unit tests are very useful in making sure
code doesn't break. So I am not saying unit tests aren't useful. I just don't think they are the
real reason behind why TDD generates working code. With a clear well functioning focussed mind
we generate working code. But getting into that state is hard.
Writing comments simultaneously with all other aspects of development deepens your mindfulness
because you are thinking about everything at once. All the interconnections are present in your
brain because you are explaining the intent behind what you are doing.
There's a saying that you don't know something until you teach it. Comments are teaching what you
are doing to someone else. When you are writing comments you must generate the thoughts to
teach, to explain to someone else the intent behind what you are doing. It's very difficult
to make a coding error when all this context is hot in your mind.
I'll go back and forth between documenting, testing, and coding. I'll let the problem dictate
what happens when as I am working my way through solving the problem. Saying testing should
always come first is too simple a rule and I think misses the larger point about software
development.
Software is ultimately mind stuff. Using our minds better is the real methodology.
Make Gotchas Explicit
Explicitly comment variables changed out of the normal control
flow or other code likely to break during maintenance. Embedded
keywords are used to point out issues and potential problems. Consider that a robot
will parse your comments looking for keywords, stripping them out, and making
a report so people can make a special effort where needed.
Gotcha Keywords
:TODO: topic Means there's more to do here, don't forget.
:BUG: [bugid] topic Means there's a
known bug here, explain it and optionally give a bug ID.
:KLUDGE: When you've done something ugly say so and explain
how you would do it differently next time if you had more time.
:TRICKY: Tells somebody that the following code is very tricky
so don't go changing it without thinking.
:WARNING: Beware of something.
:COMPILER: Sometimes you need to work around a compiler
problem. Document it. The problem may go away eventually.
:ATTRIBUTE: value The general form of an attribute embedded in
a comment. You can make up your own attributes and they'll be
extracted.
Gotcha Formatting
Make the gotcha keyword the first symbol in the comment.
Comments may consist of multiple lines, but the first line
should be a self-containing, meaningful summary.
The writer's name and the date of the remark should be part
of the comment. This information is in the source repository, but
it can take quite a while to find out when and by whom it was
added. Often gotchas stick around longer than they should.
Embedding date information allows other programmers to make this
decision. Embedding who information lets us know who to ask.
Example
// :TODO: tmh 960810: possible performance problem
// We should really use a hash table here but for now we'll
// use a linear search.
// :KLUDGE: tmh 960810: possible unsafe type cast
// We need a cast here to recover the derived type. It should
// probably use a virtual method or template.
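The report-generating robot described above might be sketched as follows. This is a hypothetical scanner, not part of any existing tool; ExtractGotchas is an invented helper name:

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Pull every "// :KEYWORD: rest of line" comment out of a source text.
// Produces report lines like "TODO: tmh 960810: possible performance problem".
std::vector<std::string> ExtractGotchas(const std::string& source)
{
    std::vector<std::string> report;
    // Matches a comment containing a :KEYWORD: marker, capturing the
    // keyword and the remainder of the line.
    std::regex gotcha("//\\s*:([A-Z-]+):\\s*(.*)");
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line))
    {
        std::smatch m;
        if (std::regex_search(line, m, gotcha))
            report.push_back(m[1].str() + ": " + m[2].str());
    }
    return report;
}
```

Run something like this over the tree as part of the nightly build and mail out the report.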
With a little forethought we can extract both types of
documentation directly from source code.
Class Users
Class users need class interface information which when structured
correctly can be extracted directly from a header file. When
filling out the header comment blocks for a class, only
include information needed by programmers who use the class.
Don't delve into algorithm implementation details unless the
details are needed by a user of the class. Consider comments
in a header file a man page in waiting.
Class Implementors
Class implementors require in-depth knowledge of how a class
is implemented. This comment type is found in
the source file(s) implementing a class. Don't worry about
interface issues. Header comment blocks in a source file should
cover algorithm issues and other design decisions. Comment blocks
within a method's implementation should explain even more.
Directory Documentation
Every directory should have a README file that covers:
the purpose of the directory and what it contains
a one line comment on each file. A comment can
usually be extracted from the NAME attribute of the
file header.
cover build and install directions
direct people to related resources:
directories of source
online documentation
paper documentation
design documentation
anything else that might help someone
Consider a new person coming in 6 months after every
original person on a project has gone. That lone scared
explorer should be able to piece together a picture of the
whole project by traversing a source directory tree and
reading README files, Makefiles, and source file headers.
Include Statement Documentation
Include statements should be documented, telling the user why a
particular file was included. If the file includes a class
used by the class then it's useful to specify a class relationship:
ISA - this class inherits from the class in the include file.
HASA - this class contains, that is has as a member attribute,
the class in the include file. This class owns the memory and
is responsible for deleting it.
USES - this class uses something from the include file.
HASA-USES - this class keeps a pointer or reference to the
class in the include file, but this class does not own the
memory.
Example
#ifndef XX_h
#define XX_h
// SYSTEM INCLUDES
//
#include <stdio.h>  // standard IO interface
#include <string>   // HASA string interface
#include <memory>   // USES auto_ptr
Notice how just by reading the include directives the code is starting
to tell you a story of why and how it was built.
Block Comments
Use comments on starting and ending a Block:
{
// Block1 (meaningful comment about Block1)
... some code
{
// Block2 (meaningful comment about Block2)
... some code
} // End Block2
} // End Block1
This may make block matching much easier to spot when you
don't have an intelligent editor.
Layering is the primary technique for reducing complexity in
a system. A system should be divided into layers. Layers
should communicate between adjacent layers using well defined
interfaces. When a layer uses a non-adjacent layer then a
layering violation has occurred.
A layering violation simply means we have dependency between
layers that is not controlled by a well defined interface.
When one of the layers changes code could break. We don't
want code to break so we want layers to work only with
other adjacent layers.
Sometimes we need to jump layers for performance reasons.
This is fine, but we should know we are doing it and document
appropriately.
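The layering rule can be sketched with a hypothetical two-layer stack; Transport and Session are invented names for illustration:

```cpp
#include <cassert>

// The lower layer exposes a well defined interface.
class Transport
{
public:
    virtual ~Transport() {}
    virtual void Send(const char* bytes, int len) = 0;
};

// The adjacent layer above talks only to Transport's interface.
class Session
{
public:
    explicit Session(Transport* pTransport) : mpTransport(pTransport) {}
    void SendMessage(const char* msg, int len)
    {
        // Application code above Session never touches Transport
        // directly; calling Send from the application layer would
        // be a layering violation.
        mpTransport->Send(msg, len);
    }
private:
    Transport* mpTransport;
};
```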
Minimize Dependencies with Abstract Base Classes
One of the most important strategies in C++ is to remove dependencies
among different subsystems. Abstract base classes (ABCs) are
a solid technique for dependency removal.
An ABC is an abstraction of a common form such that it can be
used to build more specific forms. An ABC is a common interface
that is reusable across a broad range of similar classes.
By specifying a common interface, it doesn't really matter which derived type
is used as long as the class conforms to that
interface. This breaks code dependencies. New classes,
conforming to the interface, can be substituted in at will without
breaking code. In C++ interfaces are specified by using base classes
with virtual methods.
The above is a bit rambling because it's a hard idea to
convey. So let's use an example: We are doing a GUI
where things jump around on the screen. One approach
is to do something like:
class Frog
{
public:
    void Jump();
};

class Bean
{
public:
    void Jump();
};
The GUI folks could instantiate each object and call the Jump
method of each object. The Jump method of each object contains
the implementation of jumping behavior for that type of object.
Obviously frogs and beans jump differently even though both
can jump.
Unfortunately the owner of Bean didn't like the word
Jump so they changed the method name to Leap. This broke the
code in the GUI and one whole week was lost.
Then someone wanted to see a horse jump so a Horse class was added:
class Horse
{
public:
    void Jump();
};
The GUI people had to change their code again to add Horse.
Then someone updated Horse so that its Jump behavior was slightly different.
Unfortunately this caused a total recompile of the GUI code and they
were pissed.
Someone got the bright idea of trying to remove all the above
dependencies using abstract base classes. They made one base
class that specified an interface for jumping things:
class Jumpable
{
public:
    virtual void Jump() = 0;
};
Jumpable is a base class because other classes need to derive
from it so they can get Jumpable's interface. It's an
abstract base class because one or more of its methods has
the = 0 notation which means the method is a
pure virtual method. Pure virtual methods
must be implemented by derived classes. The compiler
checks.
Not all methods in an ABC must be pure virtual, some may have an
implementation. This is especially true when creating a base class
encapsulating a process common to a lot of objects.
For example, devices that must be opened, diagnostics run, booted,
executed, and then closed on a certain event may create an ABC
called Device that has a method called LifeCycle which calls all other
methods in turn thus running through all phases of a device's life.
Each device phase would have a pure virtual method in the base class
requiring implementation by more specific devices. This way
the process of using a device is made common but the specifics
of a device are hidden behind a common interface.
Back to Jumpable. All the classes were changed to derive
from Jumpable:
class Frog : public Jumpable
{
public:
    virtual void Jump() { ... }
};
etc ...
We see an immediate benefit: we know all classes derived from
Jumpable must have a Jump method. No one can go changing
the name to Leap without the compiler complaining. One dependency
broken.
Another benefit is that we can pass Jumpable objects
to the GUI, not specific objects like Horse or Frog:
class Gui
{
public:
    void MakeJump(Jumpable*);
};
Gui gui;
Frog* pFrog = new Frog;
gui.MakeJump(pFrog);
Notice Gui doesn't even know it's making a frog jump, it
just has a jumpable thing, that's all it cares about. When
Gui calls the Jump method it will get the implementation
for Frog's Jump method. Another dependency down. Gui
doesn't have to know what kind of objects are jumping.
We also removed the recompile dependency.
Because Gui doesn't contain any Frog objects it will not
be recompiled when Frog changes.
Downside
Wow! Great stuff! Yes but there are a few downsides:
Overhead for Virtual Methods
Virtual methods have a space and time penalty. It's not huge,
but should be considered in design.
Make Everything an ABC!
Sometimes people overdo it, making everything an ABC.
The rule is make an ABC when you need one not when you
might need one. It takes effort to design a good ABC,
throwing in a virtual method doesn't an ABC make. Pick
and choose your spots. When some process or
some interface can be reused and people will actually
make use of the reuse then make an ABC and don't look back.
Liskov's Substitution Principle (LSP)
This principle states:
All classes derived from a base class should be interchangeable
when used as a base class.
The idea is users of a class should be able to count on similar
behavior from all classes that derive from a base class. No
special code should be necessary to qualify an object before
using it. If you think about it violating LSP is also violating
the Open/Closed principle because the code
would have to be modified every time a derived class was added.
It's also related to dependency management using
abstract base classes.
For example, if the Jump method of a
Frog object implementing the Jumpable interface actually makes
a call and orders pizza we can say its implementation is not in the
spirit of Jump and probably all other objects implementing Jump.
Before calling a Jump method a programmer would now have to check
for the Frog type so it wouldn't screw up the system. We don't
want this in programs. We want to use base classes and feel
comfortable we will get consistent behaviour.
LSP is a very restrictive idea. It constrains implementors quite
a bit. In general people support LSP and have LSP as a goal.
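The pizza-ordering Frog above can be sketched directly; Jumpable is restated so the fragment stands alone, and BadFrog is an invented name:

```cpp
#include <cassert>

class Jumpable
{
public:
    virtual ~Jumpable() {}
    virtual void Jump() = 0;
};

// LSP-respecting code takes any Jumpable without qualifying its type:
void MakeJump(Jumpable* pJumper)
{
    pJumper->Jump(); // no "is it really a Frog?" check allowed here
}

class Frog : public Jumpable
{
public:
    int jumps = 0;
    virtual void Jump() { ++jumps; } // behaves in the spirit of Jump
};

class BadFrog : public Jumpable
{
public:
    virtual void Jump() { OrderPizza(); } // violates LSP: not a jump at all
private:
    void OrderPizza() { /* ... */ }
};
```

BadFrog compiles fine; the compiler cannot enforce LSP, which is why it is a design principle rather than a language rule.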
Open/Closed Principle
The Open/Closed principle states a class must be open and
closed where:
open means a class has the ability to be extended.
closed means a class is closed for modifications other than extension.
The idea is once a class has been approved for use having gone
through code reviews, unit tests, and other qualifying
procedures, you don't want to change the class very much, just extend it.
The Open/Closed principle is a pitch for stability. A system is extended by adding
new code not by changing already working code. Programmers often don't feel
comfortable changing old code because it works! This principle just gives
you an academic sounding justification for your fears :-)
In practice the Open/Closed principle simply means making good use of our
old friends abstraction and polymorphism. Abstraction to factor out common
processes and ideas. Inheritance to create an interface that must be adhered
to by derived classes. In C++ we are talking about using
abstract base classes. A lot.
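A minimal sketch of Open/Closed using the earlier Jumpable interface; Kangaroo is an invented extension:

```cpp
#include <cassert>

class Jumpable
{
public:
    virtual ~Jumpable() {}
    virtual void Jump() = 0;
};

// Gui is closed to modification: it never needs editing again.
class Gui
{
public:
    void MakeJump(Jumpable* p) { p->Jump(); }
};

// Gui is open to extension: a brand new jumper is added without
// touching any existing, already-qualified code.
class Kangaroo : public Jumpable
{
public:
    int jumps = 0;
    virtual void Jump() { ++jumps; }
};
```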
Register/Dispatch Idiom
Another strategy for reducing dependencies in a system is
the Register/Dispatch Idiom (RDI). RDI treats large
grained occurrences in a system as events. Events are identified
by some unique identifier. Objects in the system register with
a dispatch system for the events or classes of events they are interested
in. Objects that are event sources send events into the dispatch
system so the dispatch system can route events to consumers.
RDI separates producers and consumers on a distributed scale.
Event producers and consumers don't have to know about each
other at all. Consumers can drop out of the event stream
by deregistering for events. New consumers can register
for events at anytime. Event producers can drop out with
no ill effect to event consumers, the consumer just won't
get any more events. It is a good idea for producers to have
an "I'm going down event" so consumers can react intelligently.
Logically the dispatch system is a central entity. The implementation
however can be quite different. For a highly distributed system
a truly centralized event dispatcher would be a performance bottleneck
and a single point of failure. Think of event dispatchers as being
a lot of different processes cast about on various machines for redundancy
purposes. Event processors communicate amongst each other to distribute
knowledge about event consumers and producers. Much like a routing
protocol distributes routing information to its peers.
RDI works equally well in the small, in processes and single workstations.
Parts of the system can register as event consumers and event producers
making for a very flexible system. Complex decisions in a system are expressed
as event registrations and deregistrations. No further level of cooperation
required.
More expressive event filters can also be used. The above proposal filters
events on some unique ID. Often you want events filtered on more complex
criteria, much like a database query. For this to work the system has to
understand all data formats. This is easy if you use a common format like
attribute value pairs. Otherwise each filter needs code understanding
packet formats. Compiling in filter code to each dispatcher is one approach.
Creating a downloadable generic stack based filter language has been used with
success on other projects, being both simple and efficient.
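An in-process sketch of RDI, assuming events are keyed by a string ID; the Dispatcher class here is illustrative, not a real library:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Consumers register a callback for an event ID; producers just call
// Dispatch. Neither side knows the other exists. (A distributed
// version would replace the map with peered dispatcher processes.)
class Dispatcher
{
public:
    typedef std::function<void(const std::string&)> Handler;

    void Register(const std::string& eventId, Handler handler)
    {
        mHandlers[eventId].push_back(handler);
    }
    void Dispatch(const std::string& eventId, const std::string& payload)
    {
        for (auto& h : mHandlers[eventId]) // no-op if nobody registered
            h(payload);
    }
private:
    std::map<std::string, std::vector<Handler>> mHandlers;
};
```

Deregistration and event-class filters would be added the same way; the point is that producer and consumer only share the event ID.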
Delegation
Delegation is the idea of a method using another object's
method to do the real work. In some sense the top layer method
is a front for the other method. Delegation is a form
of dependency breaking. The top layer method never has to
change while its implementation can change at will.
Delegation is an alternative to using inheritance for
implementation purposes. One can use inheritance
to define an interface and delegation to implement
the interface.
Some people feel delegation is a more robust form
of OO than using implementation inheritance. Delegation
encourages the formation of abstract class interfaces
and HASA relationships. Both of which encourage reuse
and dependency breaking.
In this example a test taker delegates actually answering the
question to a paid test taker. Not ethical but a definite
example of delegation!
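The test taker example referred to above might look like this (class names assumed):

```cpp
#include <cassert>
#include <string>

class PaidTestTaker
{
public:
    std::string AnswerQuestion(const std::string& question)
    {
        return "expert answer to: " + question;
    }
};

// TestTaker defines the interface but delegates the real work to a
// PaidTestTaker it holds (a HASA relationship). Callers never know.
class TestTaker
{
public:
    std::string AnswerQuestion(const std::string& question)
    {
        // Delegation: the front method does no real work itself.
        return mPaidTestTaker.AnswerQuestion(question);
    }
private:
    PaidTestTaker mPaidTestTaker; // the one actually doing the work
};
```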
Follow the Law of Demeter
The Law of Demeter states (Wikipedia):
An object A can request a service (call a method) of an object instance B,
but object A cannot "reach through" object B to access yet another object
to request its services. Doing so would mean that object A implicitly requires
greater knowledge of object B's internal structure. Instead, B's class should
be modified if necessary so that object A can simply make the request directly
of object B, and then let object B propagate the request to any relevant
subcomponents. If the law is followed, only object B knows its internal structure.
Justification
The purpose of this law is to break dependencies
so implementations can change without breaking code.
If an object wishes to remove one of its contained objects
it won't be able to do so because some other object is using it.
If instead the service was through an interface the object
could change its implementation anytime without ill effect.
Caveat
As with most laws, the Law of Demeter should be ignored in
certain cases. If you have a really high level object
that contains a lot of subobjects, like a car contains
thousands of parts, it can get absurd to create a method
in car for every access to a subobject.
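A minimal sketch of the law; Car and Engine are the stock illustration, not from any real codebase:

```cpp
#include <cassert>

class Engine
{
public:
    bool Start() { mRunning = true; return mRunning; }
private:
    bool mRunning = false;
};

class Car
{
public:
    // Demeter-friendly: Car propagates the request to its subobject,
    // so callers never depend on Car containing an Engine at all.
    bool Start() { return mEngine.Start(); }
private:
    Engine mEngine; // callers cannot reach through to this
};

// Good: car.Start();
// Bad:  car.GetEngine().Start();  // requires knowing Car's internals
```

If Car later switches to an electric motor, only Car changes; every caller of Start() keeps working.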
The idea of design by contract is strongly related to LSP .
A contract is a formal statement of what to expect from another party. In this case
the contract is between pieces of code. An object and/or method states that it does
X and you are supposed to believe it. For example, when you ask an object for its
volume that's what you should get. And because volume is a verifiable attribute
of a thing you could run a series of checks to verify volume is correct, that is,
it satisfies its contract.
The contract is enforced in languages like
Eiffel by pre and post condition statements that are actually part of the language.
In other languages a bit of faith is needed.
Design by contract when coupled with language based verification mechanisms
is a very powerful idea. It makes programming more like assembling spec'd parts.
Using Design by Contract
DO NOT PUT "REAL" CODE IN DBC CALLS. Dbc calls should
only test conditions. No code that can't be compiled out should
be included in Dbc calls.
Every method should define its pre and post conditions.
Every class should define its invariants.
Callers are responsible for checking preconditions.
An object may not and is not required to test for assertion violations.
Method pre-conditions should be documented in a method's
interface documentation.
Pre-conditions can be weakened by derived classes.
Post-conditions can be strengthened by derived classes.
Code its invariants and call them in its operations.
Document its invariants in the class documentation.
Yes, this takes a lot of work, but high availability
is the system's __primary goal__;
meeting this goal requires a lot of work and effort by each
programmer.
Document the exceptions, pre, and post conditions.
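A sketch of Dbc in plain C++, assuming assert-based REQUIRE/ENSURE macros since C++, unlike Eiffel, has no built-in pre and post condition statements:

```cpp
#include <cassert>

// Hypothetical Dbc macros. They only test conditions and compile out
// entirely under NDEBUG, per the rule that no "real" code may live
// in Dbc calls.
#define REQUIRE(cond) assert(cond)  // precondition
#define ENSURE(cond)  assert(cond)  // postcondition

class Stack
{
public:
    void Push(int value)
    {
        int oldSize = mSize;
        mData[mSize++] = value;
        ENSURE(mSize == oldSize + 1); // postcondition holds on exit
    }
    int Pop()
    {
        REQUIRE(mSize > 0); // caller's job to ensure this precondition
        return mData[--mSize];
    }
    bool IsEmpty() const { return mSize == 0; }
private:
    int mData[16];
    int mSize = 0;
};
```

The preconditions double as the interface documentation the rules above call for.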
Each class definition should be in its own file where each file is named directly
after the class's name:
ClassName.h
Implementation in One File
In general each class should be implemented in one source file:
ClassName.cc // or whatever the extension is: cpp, c++
But When it Gets Really Big...
If the source file gets too large or you want to avoid compiling
templates all the time then add additional files named according
to the following rule:
ClassName_section.cc
section is some name that identifies why the code is chunked
together. The class name and section name are separated by '_'.
It is recommended a program like
Doxygen
be used to document C++ classes, methods, variables, functions,
and macros. The documentation can be extracted and put in places
in a common area for all programmers to access. This saves programmers
having to read through class headers. Documentation generation
should be integrated with the build system where possible.
Template
Please use the following template when creating a new class.
/** A one line description of the class.
*
* #include "XX.h" <BR>
* -llib
*
* A longer description.
*
* @see something
*/
#ifndef XX_h
#define XX_h
// SYSTEM INCLUDES
//
// PROJECT INCLUDES
//
// LOCAL INCLUDES
//
// FORWARD REFERENCES
//
class XX
{
public:
// LIFECYCLE
/** Default constructor.
*/
XX(void);
/** Copy constructor.
*
* @param from The value to copy to this object.
*/
XX(const XX& from);
/** Destructor.
*/
~XX(void);
// OPERATORS
/** Assignment operator.
*
* @param from The value to assign to this object.
*
* @return A reference to this object.
*/
XX& operator=(const XX& from);
// OPERATIONS
// ACCESS
// INQUIRY
protected:
private:
};
// INLINE METHODS
//
// EXTERNAL REFERENCES
//
#endif // XX_h
Required Methods Placeholders
The template has placeholders for the required methods.
You can delete them or implement them.
Ordering is: public, protected, private
Notice that the public interface is placed first in the class, protected next,
and private last. The reasons are:
programmers should care about a class's interface more than implementation
when programmers need to use a class they need the interface not the
implementation
It makes sense then to have the interface first. Placing the implementation,
the private section, first is a historical accident: the earliest examples
used the private-first layout. Over time emphasis has switched to emphasizing
a class's interface over its implementation details.
LIFECYCLE
The life cycle section is for methods that control the life cycle of an
object. Typically these methods include constructors, destructors, and
state machine methods.
OPERATORS
Place all operators in this section.
OPERATIONS
Place the bulk of a class's methods, other than access and inquiry methods, here.
A programmer will look here for the meat of a class's interface.
ACCESS
Place attribute accessors here.
INQUIRY
These are the Is* methods. Whenever you have a question to ask about
an object it can be asked via an Is method. For example: IsOpen()
will indicate if the object is open. A good strategy is, instead of making
a lot of access methods, to turn them around into questions about the
object, thus reducing the exposure of internal structure. Without the IsOpen()
method we might have had to write: if (STATE_OPEN == State()), which is much uglier.
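As a sketch of how this reads in practice (the File class, its state values, and its method names are hypothetical, not part of the standard):

```cpp
#include <cassert>

class File
{
public:
    File() : mState(STATE_CLOSED) {}

    void Open()  { mState = STATE_OPEN; }
    void Close() { mState = STATE_CLOSED; }

    // INQUIRY: asks a question about the object without exposing
    // how the state is represented internally.
    bool IsOpen() const { return STATE_OPEN == mState; }

private:
    enum State { STATE_CLOSED, STATE_OPEN };
    State mState;
};
```

Callers can now write if (file.IsOpen()) rather than comparing against an exposed state value.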
class Planet
{
public:
/** The following is the default constructor if no arguments are supplied.
*/
Planet(int radius= 5);
// Use compiler-generated copy constructor, assignment, and destructor.
// Planet(const Planet&);
// Planet& operator=(const Planet&);
// ~Planet();
};
Every parameter should be documented. Every return code should
be documented. All exceptions should be documented.
Use complete sentences when describing attributes.
Make sure to think about what other resources developers
may need and encode them in with the @see attributes.
/** Assignment operator.
*
*
* @param val The value to assign to this object.
* @exception LibraryException The explanation for the exception.
* @return A reference to this object.
*/
XX& operator=(const XX& val);
Additional Sections
In addition to the standard attribute set, the following
sections can be included in the documentation:
PRECONDITION
Document what must have happened for the object to be in a state
where the method can be called.
WARNING
Document anything unusual users should know about this method.
LOCK REQUIRED
Some methods require a semaphore be acquired before using the method. When
this is the case use lock required and specify the name of the lock.
EXAMPLES
Include examples of how to use a method. A picture is worth a 1000 words;
a good example answers a 1000 questions.
For example:
/** Copy one string to another.
*
* PRECONDITION
* REQUIRE(from != 0)
* REQUIRE(to != 0)
*
* WARNING
* The to buffer must be long enough to hold
* the entire from buffer.
*
* EXAMPLES
*
* strcpy(somebuf, "test")
*
*
* @param from The string to copy.
* @param to The buffer to copy the string to.
*
* @return void
*/
void strcpy(const char* from, char* to);
Common Exception Sections
If the same exceptions are being used in a number of
methods, then the exceptions can be documented once
in the class header and referred to from the method
documentation.
Formatting Methods with Multiple Arguments
We should try to make methods have as few parameters as possible. If you find yourself
passing the same variable to every method then that variable should probably be
part of the class. When a method does have a lot of parameters format it
like this:
int AnyMethod(
int arg1,
int arg2,
int arg3,
int arg4);
Objects with multiple constructors and/or multiple attributes
should define a private Init() method to initialize
all attributes. If the number of different member variables
is small then this idiom may not be a big win and C++'s
constructor initialization syntax can/should be used.
Justification
When using C++'s ability to initialize variables in the constructor
it's difficult with multiple constructors and/or multiple
attributes to make sure all attributes are initialized. When an attribute
is added or changed almost invariably we'll miss changing a
constructor.
It's better to define one method, Init(), that initializes
all possible attributes. Init() should be called first from every
constructor.
The Init() idiom cannot be used in two cases where initialization
from a constructor is required:
constructing a member object
initializing a member attribute that is a reference
Example
class Test
{
public:
Test()
{
Init(); // Call to common object initializer
}
Test(int val)
{
Init(); // Call to common object initializer
mVal= val;
}
private:
int mVal;
String* mpName;
void Init()
{
mVal = 0;
mpName= 0;
}
};
Since the number of member variables is small, this might be better
written as:
class Test
{
public:
Test(int val = 0, String* name = 0)
: mVal(val), mpName(name) {}
private:
int mVal;
String* mpName;
};
Initialize all Variables
You shall always initialize variables. Always. Every time.
Justification
More problems than you can believe are eventually traced
back to a pointer or variable left uninitialized. C++
tends to encourage this by spreading initialization to each constructor.
See Init Idiom for Initializing Objects .
Minimize Inlines
Minimize inlining in declarations or inlining in general. As soon as you
put your C++ code in a shared library which you want to maintain compatibility
with in the future, inlined code is a major pain in the butt. It's not worth
it, for most cases.
The constructor code must still be very careful not to leak resources
in the constructor. It's possible to throw an exception and not
destruct objects allocated in the constructor.
There is a pattern called Resource Acquisition Is Initialization (RAII)
that says all acquisition is performed in the constructor and
all release is performed in the destructor. The idea is that this is a safer approach
because it should reduce resource leaks.
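A minimal sketch of the pattern, using a hypothetical Mutex class (the names are illustrative only): the guard acquires the resource in its constructor and releases it in its destructor, so the release happens on every exit path, including exceptions.

```cpp
#include <cassert>

// Hypothetical resource used for illustration.
class Mutex
{
public:
    Mutex() : mLocked(false) {}
    void Acquire() { mLocked = true; }
    void Release() { mLocked = false; }
    bool IsLocked() const { return mLocked; }
private:
    bool mLocked;
};

// The RAII guard: acquisition is initialization, release is destruction.
class ScopedLock
{
public:
    ScopedLock(Mutex& mutex) : mMutex(mutex) { mMutex.Acquire(); }
    ~ScopedLock() { mMutex.Release(); }
private:
    Mutex& mMutex;
};
```

Usage: declare a ScopedLock on the stack; when the scope exits, for any reason, the mutex is released automatically.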
How many methods should an object have? The right answer of course is just the right amount; we'll call
this the Goldilocks level. But what is the Goldilocks level? It doesn't
exist. You need to make the right judgment for your situation, which is really
what programmers are for :-)
The two extremes are thin classes versus thick classes. Thin
classes are minimalist classes. Thin classes have as few methods as possible.
The expectation is users will derive their own class from the thin class adding
any needed methods.
While thin classes may seem "clean" they really aren't. You can't do much with
a thin class. Its main purpose is setting up a type. Since thin classes have so
little functionality many programmers in a project will create derived classes
with everyone adding basically the same methods. This leads to code duplication
and maintenance problems, the very things objects were supposed to help us avoid
in the first place. The obvious solution is to push methods up to the base class.
Push enough methods up to the base class and you get thick classes.
Thick classes have a lot of methods. If you can think of it a thick class
will have it. Why is this a problem? It may not be. If the methods are directly
related to the class then there's no real problem with the class containing
them. The problem is people get lazy and start adding methods to a class that
are related to the class in some will-o'-the-wisp way, but would be better factored
out into another class. Judgment comes into play again.
Thick classes have other problems. As classes get larger
they may become harder to understand. They also become harder to debug
as interactions become less predictable. And when a method is changed that
you don't use or care about your code will still have to be recompiled, possibly
retested, and rereleased.
Programmers need to have a common language for talking
about coding, designs, and the software process in general.
This is critical to project success.
Any project brings together people of widely varying skills,
knowledge, and experience. Even if everyone on a project
is a genius you will still fail if people
endlessly talk past each other because there is
no common language and no process binding the project together.
All you'll get is massive fights, burnout, and little progress.
If you send your group to training they may not come back seasoned
experts but at least your group will all be on the same page;
a team.
There are many popular methodologies out there. The
point is to do some research, pick a method, train
your people on it, and use it. Take a look at the top of
this page for links to various methodologies.
You may find the CRC (class responsibility cards) approach to
teasing out a design useful. Many others have. It is an informal
approach encouraging team cooperation and focusing on
objects doing things rather than objects having attributes. There's
even a whole book on it: Using CRC Cards by Nancy M. Wilkinson.
The Unified Modeling Language is too large to present here. Fortunately
you can see it at Rational's
web site. Since you do need a modeling language UML is a safe choice. It
combines features from several methods into one unified language. Remember
all languages and methods are open to local customization. If the language
is too complex then use the parts you and your project feel you need and
junk the rest.
Code Reviews
If you can make a formal code review work then my hat is off
to you. Code reviews can be very useful. Unfortunately they
often degrade into nit picking sessions and endless arguments
about silly things. They also tend to take a lot of people's
time for a questionable payback.
My god, he's questioning code reviews; he's not an engineer!
Not really. What is being questioned is the form code reviews take and how
they fit into normally late, chaotic projects.
First, code reviews are way too late to do much of
anything useful. What needs reviewing are requirements and
design. This is where you will get more bang for the buck.
Get all relevant people in a room. Lock them in. Go over the class design
and requirements until the former is good and the latter is being met.
Having all the relevant people in the room makes this process
a deep fruitful one as questions can be immediately answered and
issues immediately explored. Usually only a couple of such meetings
are necessary.
If the above process is done well coding will take
care of itself. If you find problems in the code
review the best you can usually do is a rewrite after someone has sunk
a ton of time and effort into making the code "work."
You will still want to do a code review, just do it offline. Have a
couple people you trust read the code in question and simply make
comments to the programmer. Then the programmer and reviewers
can discuss issues and work them out. Email and quick pointed
discussions work well. This approach meets the goals
and doesn't take the time of 6 people to do it.
For more information on code reviews please take a look
here.
You'll find a lot of information on justifying code reviews if you are having
troubles instituting them and lots of suggestions on how to conduct them.
Jira - a full featured and reliable bug tracking system.
It works best when combined with Confluence, their wiki product.
Bugzilla - a free product that is functional
and widely used.
As with source code control systems, there are many bug tracking systems available.
It's more important that you use one than which one you use.
RCS Keywords, Change Log, and History Policy
When using RCS directly this policy must change, but when
using other source code control systems like CVS that
support RCS style keywords:
Do not use RCS keywords within files.
Do not keep a change history in files.
Do not keep author information in files.
Justification
The reasoning is your source control system already keeps
all this information. There is no reason to clutter up
source files with duplicate information that:
makes the files larger
makes doing diffs difficult as non source code lines change
makes the entry into the file dozens of lines lower in the
file which makes a search or jump necessary for each file
is easily available from the source code control system
and does not need embedding in the file
When files must be sent to other organizations the comments
may contain internal details that should not be exposed to
outsiders.
Responsibility for software modules is scoped. Modules are either the responsibility of a
particular person or are common. Honor this division of responsibility. Don't
go changing things that aren't your responsibility to change. Only mistakes
and hard feelings will result.
Face it, if you don't own a piece of code you can't possibly be in a position to
change it. There's too much context. Assumptions seemingly reasonable to you
may be totally wrong. If you need a change simply ask the responsible person
to change it. Or ask them if it is OK to make such-n-such a change. If they say OK
then go ahead, otherwise holster your editor.
Every rule has exceptions. If it's 3 in the morning and you need to make a change
to make a deliverable then you have to do it. If someone is on vacation and no one
has been assigned their module then you have to do it. If you make changes in other
people's code try and use the same style they have adopted.
Programmers need to mark with comments code that is particularly sensitive to
change. If code in one area requires changes to code in another area then
say so. If changing data formats will cause conflicts with persistent stores
or remote message sending then say so. If you are trying to minimize memory
usage or achieve some other end then say so. Not everyone is as brilliant as you.
The worst sin is to flit through the system changing bits of code to match your
coding style. If someone isn't coding to the standards then ask them or ask
your manager to ask them to code to the standards. Use common courtesy.
Code with common responsibility should be treated with care. Resist making radical
changes as the conflicts will be hard to resolve. Put comments in the file on how
the file should be extended so everyone will follow the same rules. Try and use
a common structure in all common files so people don't have to guess on where
to find things and how to make changes. Check in changes as soon as possible so
conflicts don't build up.
As an aside, module responsibilities must also be assigned for bug tracking purposes.
Process Automation
It's a sad fact of human nature that if you don't measure it or check for
it, it won't happen. The implication is you must automate as much of
the development process as possible and provide direct feedback
to developers on specific issues that they can fix.
Process automation also frees up developers to do real work because they
don't have to babysit builds and other project time sinks.
Automated Builds and Error Assignment
Create an automated build system that can create nightly builds, parse
the build errors, assign the errors to developers, and email developers
their particular errors so they can fix them.
This is the best way to maintain a clean build. Make sure the list
of all errors for a build is available for everyone to see so
everyone can see everyone else's errors. The goal is to replace
a blame culture with a culture that tries to get things right
and fixes them when they are wrong. Immediate feedback makes this
possible.
Automated Code Checking
As part of the automated build process you can check for coding standard
violations and for other problems. If you don't check for it people
will naturally do their own thing. Code reviews aren't good enough
to keep the code correct. With a tool like
Abraxis Code Check
you can check the code for a lot of potential problems.
This feature, like automated error assignment, makes problems immediately
visible and immediately correctable, all without a lot of blame
and shame.
Documentation Extraction
Related to this principle is the need to automatically extract documentation
from the source code and make it available on line for everyone to use.
If you don't do this documentation will be seen as generally useless and
developers won't put as much effort into it. Making the documentation
visible encourages people to do a better job.
Connect Source Code Control System and Bug Tracking System
When a check-in of source code fixes a bug then have the check-in
automatically tell the bug tracking system that the bug was fixed.
When a bug fix is built in a build change the state of the bug to
BUILT.
Have a submit tool where people ask judges if they can submit
a bug fix on the branch.
There are lots of things you can do depending on how complicated
your environment is. The more complicated the environment the more
you should think about connecting all systems together.
Tools Agreement
The reality of different tool preferences is something to deal with
explicitly and openly. Tools include IDEs, languages, editors, make program,
source code control, bug system, debuggers, test framework, etc. Some tool
decisions by their nature must be project wide, other decisions
can be customized per developer.
A split might also be done by who is performing the build. For example, an
IDE should be able to be used in local builds, but the make program would
be used for nightly and release builds.
Certain things are easy/trivial/useful with one tool, but
hard/complicated/stupid with another tool. Unstated tool assumptions
can be the source of a lot of confusion.
"Get a better editor" is not always a workable response, though
sometimes that's all there is to it!
Non-Blocking Scheduling
Schedules are lies. Schedules suck. Yes, yes, yes.
But we still need them.
The most effective scheduling rule I've used is to schedule so
as to unblock others. The idea is to complete the portions of
a feature that will unblock those dependent on you. This way
development moves along smoothly because more lines of development
can be active at a time. For example, instead of implementing
the entire database, implement the simple interface and stub
it out. People can work for a very long time this way using
just the portion of the feature that would otherwise have blocked them.
Plus it's a form of rapid prototyping because you get immediate
feedback on these parts. Don't worry about the quality of
implementation because it doesn't matter yet.
Using Personas
Personas are a powerful design tool, especially when combined with
responsibility driven design.
Cooper's personas are:
simply pretend users of the system you're building. You describe
them, in a surprising amount of detail, and then design your
system for them.
I have a standard set of personas that I consider when creating
a design/architecture; they don't seem to be commonly used. When you write
code there are a lot of personas looking over your shoulder:
other programmers using the code
maintenance
extension
documentation group
training group
code review
test and validation
manufacturing
field support
first and second line technical support
live debugging
post crash debugging
build system (documentation generation and automatic testing)
unit testing
system testing
source code control
code readers
legal
You are much more careful and more thorough when you really think about
all the personas, all the different people and all their
different roles and purposes.
Of the three major brace placement strategies two are acceptable,
with the first one listed being preferable:
Place brace under and inline with keywords:
if (condition)
{
   ...
}

while (condition)
{
   ...
}
Traditional Unix policy of placing the initial brace on the
same line as the keyword and the trailing brace inline on its
own line with the keyword:
if (condition) {
   ...
}

while (condition) {
   ...
}
Justification
Another religious issue of great debate solved by compromise.
Either form is acceptable; many people, however, find the first
form more pleasant. Why this is so is the topic of many psychological
studies.
There are more reasons than psychological for preferring the first style.
If you use an editor (such as vi) that supports brace matching, the first
is a much better style. Why? Let's say you have a large block of code
and want to know where the block ends. You move to the first brace hit
a key and the editor finds the matching brace. Example:
if (very_long_condition && second_very_long_condition)
{
...
}
else if (...)
{
..
}
To move from block to block you just need to use cursor down and your
brace matching key. No need to move to the end of the line to match
a brace then jerk back and forth.
When Braces are Needed
All if, while and do statements must either have braces or be on a single
line.
Always Uses Braces Form
All if, while and do statements require braces even if there is only a
single statement within the braces. For example:
if (1 == somevalue)
{
somevalue = 2;
}
Justification
It ensures that when someone adds a line of code later there are
already braces and they don't forget. It provides a more consistent look.
This doesn't affect execution speed. It's easy to do.
One Line Form
if (1 == somevalue) somevalue = 2;
Justification
It provides safety when adding new lines while maintaining a compact,
readable form.
Add Comments to Closing Braces
Adding a comment to closing braces can help when you are
reading code because you don't have to find the begin brace
to know what is going on.
while(1)
{
if (valid)
{
} // if valid
else
{
} // not valid
} // end forever
Consider Screen Size Limits
Some people like blocks to fit within a common screen size
so scrolling is not necessary when reading code.
Indentation/Tabs/Space Policy
Indent using 3, 4, or 8 spaces for each level.
Do not use tabs, use spaces. Most editors can substitute
spaces for tabs.
Tabs should be fixed at 8 spaces. Don't set tabs to a different spacing;
use spaces instead.
Indent as much as needed, but no more. There are no arbitrary
rules as to the maximum indenting level. If the indenting level
is more than 4 or 5 levels you may think about factoring out
code.
Justification
Tabs aren't used because 8 space indentation severely limits the number
of indentation levels one can have. The argument that if this is a problem
you have too many indentation levels has some force, but real code can
often be three or more levels deep. Changing a tab to be less than 8 spaces is a problem
because that setting is usually local. When someone prints the source
tabs will be 8 characters and the code will look horrible. Same for
people using other editors. Which is why we use spaces...
When people use different tab settings the code is impossible
to read or print, which is why spaces are preferable to tabs.
Nobody can ever agree on the correct number of spaces, just be
consistent. In general people have found 3 or 4 spaces per indentation
level workable.
As much as people would like to limit the maximum indentation
levels it never seems to work in general. We'll trust that
programmers will choose wisely how deep to nest code.
Example
void
func()
{
if (something bad)
{
if (another thing bad)
{
while (more input)
{
}
}
}
}
Parens () with Key Words and Functions Policy
Do not put parens next to keywords. Put a space between.
Do put parens next to function names.
Do not use parens in return statements when it's not necessary.
Justification
Keywords are not functions. Putting parens next to keywords
makes keywords and function names look alike.
Example
if (condition)
{
}
while (condition)
{
}
strcpy(s, s1);
return 1;
A Line Should Not Exceed 78 Characters
Lines should not exceed 78 characters.
Justification
Even though with big monitors we stretch windows wide
our printers can only print so wide. And we still need
to print code.
The wider the window the fewer windows we can have on a screen.
More windows is better than wider windows.
It also lets us view and print diff output correctly on all terminals and
printers.
If Then Else Formatting
Layout
It's up to the programmer. Different bracing styles will yield
slightly different looks. One common approach is:
if (condition) // Comment
{
}
else if (condition) // Comment
{
}
else // Comment
{
}
If you have else if statements then it is usually a good idea
to always have an else block for finding unhandled cases. Maybe put a log
message in the else even if there is no corrective action taken.
Condition Format
Always put the constant on the left hand side of an equality/inequality
comparison. For example:
if ( 6 == errorNum ) ...
One reason is that if you leave out one of the = signs, the compiler will
find the error for you. A second reason is that it puts the value you are
looking for right up front where you can find it instead of buried at the
end of your expression. It takes a little time to get used to this
format, but then it really gets useful.
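A small demonstration of the bug this rule guards against (the function and names are illustrative; the extra parentheses are there only to silence the compiler warning so the example builds cleanly):

```cpp
#include <cassert>

// The typo '=' instead of '==' compiles and silently clobbers
// the variable; the condition is then always true.
int ClobberedErrorNum()
{
    int errorNum = 5;

    if ((errorNum = 6))   // typo: assignment, not comparison
    {
        // this branch is always taken
    }

    // Written constant-first, "if (6 = errorNum)" would not
    // compile: you cannot assign to the literal 6.
    return errorNum;
}
```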
Continue and break are really disguised gotos so they are covered
here.
Continue and break, like goto, should be used sparingly as they are magic in
code. With a simple spell the reader is beamed to god knows where for
some usually undocumented reason.
The two main problems with continue are:
It may bypass the test condition
It may bypass the increment/decrement expression
Consider the following example where both problems occur:
while (TRUE)
{
...
// A lot of code
...
if (/* some condition */) {
continue;
}
...
// A lot of code
...
if ( i++ > STOP_VALUE) break;
}
Note: "A lot of code" is there precisely so the problem cannot be
spotted easily by the programmer.
From the above example, a further rule may be given:
Mixing continue with break in the same loop is a sure way to disaster.
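One way out of the trap is to make the exit condition explicit in the loop header instead of burying a break in the body. A sketch (names are illustrative):

```cpp
#include <cassert>

const int STOP_VALUE = 10;

// The reader sees the termination condition at the top of the
// loop; no spell beams them anywhere unexpected.
int CountIterations()
{
    int iterations = 0;

    for (int i = 0; i <= STOP_VALUE; i++)
    {
        // ... a lot of code, now safe from bypassed increments ...
        ++iterations;
    }
    return iterations;
}
```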
?:
The trouble is people usually try and stuff too much code
in between the ? and :. Here are a couple of
clarity rules to follow:
Put the condition in parens so as to set it off from other code
If possible, the actions for the test should be simple functions.
Put the action for the then and else statement on a separate line
unless it can be clearly put on one line.
Example
(condition) ? func1() : func2();
or
(condition)
? long statement
: another long statement;
One Statement Per Line
There should be only one statement per line unless the statements
are very closely related.
The reasons are:
The code is easier to read. Use some white space too. Nothing is harder
to read than code that runs one line after another with no white space
or comments.
One Variable Per Line
Related to this, always define one variable per line:
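For example (the variable names and comments are illustrative):

```cpp
int   totalWidgets = 0;   // running count of widgets processed
int   widgetIndex  = 0;   // current position in the widget list
char* pScratchBuf  = 0;   // scratch buffer, allocated on demand
```

Each variable gets its own line, its own initializer, and its own comment, rather than something like int totalWidgets, widgetIndex; crammed onto one line.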
An object is presumably created to do something. Some of the
changes made by an object should persist after an object dies
(is destructed) and some changes should not. Take an object
implementing a SQL query. If a database field is updated via
the SQL object then that change should persist after the SQL
objects dies. To do its work the SQL object probably created
a database connection and allocated a bunch of memory.
When the SQL object dies we want to close the database connection
and deallocate the memory, otherwise if a lot of SQL objects
are created we will run out of database connections and/or memory.
The logic might look like:
Sql::~Sql()
{
delete connection;
delete buffer;
}
Let's say an exception is thrown while deleting the database connection.
Will the buffer be deleted? No. Exceptions are basically non-local
gotos with stack cleanup. The code for deleting the buffer will never
be executed, creating a gaping resource leak.
Special care must be taken to catch exceptions which may occur during
object destruction. Special care must also be taken to fully destruct
an object when it throws an exception.
Using RAII can help prevent many if not most
of these types of errors.
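One way to apply RAII to the Sql example, sketched with modern C++'s std::unique_ptr (the Connection and Buffer types here are hypothetical stand-ins, with counters added only so cleanup can be observed): each resource gets its own managing member, so no explicit delete in the destructor can be skipped by an exception.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the real resources. The counters
// exist only so the example can demonstrate cleanup.
struct Connection
{
    static int liveCount;
    Connection()  { ++liveCount; }
    ~Connection() { --liveCount; }
};
int Connection::liveCount = 0;

struct Buffer
{
    static int liveCount;
    Buffer()  { ++liveCount; }
    ~Buffer() { --liveCount; }
};
int Buffer::liveCount = 0;

class Sql
{
public:
    Sql()
        : mpConnection(new Connection),
          mpBuffer(new Buffer)
    {
    }
    // No hand-written destructor. Each member releases its own
    // resource, so the buffer is freed even if the connection's
    // cleanup were to misbehave.
private:
    std::unique_ptr<Connection> mpConnection;
    std::unique_ptr<Buffer>     mpBuffer;
};
```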
This section contains some miscellaneous do's and don'ts.
Don't use floating-point variables where discrete values are needed. Using
a float for a loop counter is a great way to shoot yourself in the foot.
Always test floating-point numbers as <= or >=, never use an exact
comparison (== or !=).
Compilers have bugs. Common trouble spots include structure assignment and
bit fields. You cannot generally predict which bugs a compiler has.
You could write a program that avoids all constructs that are known
broken on all compilers. You won't be able to write anything useful,
you might still encounter bugs, and the compiler might get fixed in
the meanwhile. Thus, you should write "around" compiler bugs only
when you are forced to use a particular buggy compiler.
Do not rely on automatic beautifiers. The main person who benefits from
good program style is the programmer him/herself, and especially in
the early design of handwritten algorithms or pseudo-code.
Automatic beautifiers can only be applied to complete, syntactically
correct programs and hence are not available when the need for
attention to white space and indentation is greatest. Programmers
can do a better job of making clear the complete visual layout of a
function or file, with the normal attention to detail of a careful
programmer (in other words, some of the visual layout is dictated by
intent rather than syntax and beautifiers cannot read minds). Sloppy
programmers should learn to be careful programmers instead of relying
on a beautifier to make their code readable. Finally, since beautifiers
are non-trivial programs that must parse the source, a sophisticated
beautifier is not worth the benefits gained by such a program.
Beautifiers are best for gross formatting of machine-generated code.
Accidental omission of the second "=" of the logical compare is a
problem. The following is confusing and prone to error.
if (abool= bbool) { ... }
Does the programmer really mean assignment here? Sometimes yes,
but usually no. The solution is to just not do it, an inverse
Nike philosophy. Instead use explicit tests and avoid
assignment with an implicit test.
The recommended form is to do the assignment before doing the test:
abool= bbool;
if (abool) { ... }
Modern compilers will put variables in registers automatically. Use the
register keyword sparingly to indicate the variables that you think are most
critical. In extreme cases, mark the 2-4 most critical values as
register and mark the rest as REGISTER. The latter can be #defined
to register on those machines with many registers.
Be Const Correct
C++ provides the const keyword to mark parameters
that a function will not change and to indicate when a method doesn't modify
its object. Using const in all the right places is called
"const correctness."
It's hard at first, but using const really tightens
up your coding style. Const correctness grows on you.
If you don't use const correctness from the start it can be a nightmare
to add it in later because it causes a chain reaction of needing
const everywhere. It's better to be const correct from
the start, or you probably never will be.
You can always cast away constness when necessary, but it's better
not to.
For more information see Const Correctness in the C++ FAQ.
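A small sketch of const correctness in practice (the Point class is illustrative):

```cpp
#include <cassert>

class Point
{
public:
    Point(int x, int y) : mX(x), mY(y) {}

    // const methods promise not to modify the object, so they can
    // be called through a const reference or pointer.
    int X() const { return mX; }
    int Y() const { return mY; }

    void MoveTo(int x, int y) { mX = x; mY = y; }

private:
    int mX;
    int mY;
};

// Pass by const reference: no copy is made and the compiler
// rejects any call that would modify p.
int SumOfCoordinates(const Point& p)
{
    return p.X() + p.Y();   // OK: X() and Y() are const
    // p.MoveTo(0, 0);      // would not compile
}
```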
Programmers transitioning from C to C++ sometimes find stream IO strange,
preferring the familiarity of good old stdio. Printf and gang
seem more convenient and are well understood.
Type Safety
Stdio is not type safe, which is one of the reasons you
are using C++, right? Stream IO is type safe. That's one good
reason to use streams.
Standard Interface
When you want to dump an object to a stream there is
a standard way of doing it: with the << operator.
This is not true of objects and stdio.
Interchangeability of Streams
One of the more advanced reasons for using streams is that
once an object can dump itself to a stream it can dump itself
to any stream. One stream may go to the screen, but another stream
may be a serial port or network connection. Good stuff.
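A sketch of what this buys you (the Point class is hypothetical): once operator<< is defined, the same dump code works for cout, a file stream, or a string stream:

```cpp
#include <iostream>
#include <sstream>

class Point
{
public:
    Point(int x, int y) : mX(x), mY(y) {}
    int x() const { return mX; }
    int y() const { return mY; }

private:
    int mX;
    int mY;
};

// One dump routine serves every kind of stream.
inline std::ostream& operator<<(std::ostream& out, const Point& point)
{
    return out << "(" << point.x() << "," << point.y() << ")";
}
```

The same `out << point` expression works whether out is std::cout, an std::ofstream on a file, or an std::ostringstream building a message.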
Streams Got Better
Stream IO is not perfect. It is however a lot better than
it used to be. Streams are now standardized, acceptably efficient,
more reliable, and now there's lots of documentation on how to use
streams.
Check Thread Safety
Some stream implementations are not yet thread safe. Make sure
that yours is.
But Not Perfect
For an embedded target tight on memory, streams may not make sense.
Streams inline a lot of code so you might find the image
larger than you wish. Experiment a little. Streams might
work on your target.
No Magic Numbers
A magic number is a bare naked number used in source code. It's magic
because no one has a clue what it means, including the author, within
3 months. For example:
if (22 == foo) { start_thermo_nuclear_war(); }
else if (19 == foo) { refund_lotso_money(); }
else if (16 == foo) { infinite_loop(); }
else { cry_cause_im_lost(); }
In the above example what do 22 and 19 mean? If there was a number change or the
numbers were just plain wrong how would you know?
Instead of magic numbers use a real name that means something. You can use
#define or constants or enums as names. Which one you use is a design choice. For example:
#define PRESIDENT_WENT_CRAZY (22)
const int WE_GOOFED= 19;
enum
{
THEY_DIDNT_PAY= 16
};
if (PRESIDENT_WENT_CRAZY == foo) { start_thermo_nuclear_war(); }
else if (WE_GOOFED == foo) { refund_lotso_money(); }
else if (THEY_DIDNT_PAY == foo) { infinite_loop(); }
else { happy_days_i_know_why_im_here(); }
Now isn't that better? The const and enum options are preferable
because the debugger has enough information to display
both the value and the label. The #define option just shows up as
a number in the debugger, which is very inconvenient. The const
option has the downside of allocating memory. Only you know if this
matters for your application.
Error Return Check Policy
Check every system call for an error return, unless you know
you wish to ignore errors. For example, printf
returns an error code but rarely would you check its
return. In that case you can cast the return
to (void) to make the intent to ignore it explicit.
Include the system error text for every system error message.
Check every call to malloc or realloc unless you know your
versions of these calls do the right thing. You might want to have
your own wrapper for these calls, including new, so you can do
the right thing always and developers don't have to make
memory checks everywhere.
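A sketch of such a wrapper (the name CheckedMalloc is hypothetical); it does the check once, includes the system error text, and callers never see a null pointer:

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Allocate or die trying; every caller gets a valid pointer back.
void* CheckedMalloc(std::size_t size)
{
    void* memory = std::malloc(size);
    if (0 == memory)
    {
        // Include the system error text in the message.
        std::fprintf(stderr, "malloc of %lu bytes failed: %s\n",
                     (unsigned long) size, std::strerror(errno));
        std::exit(EXIT_FAILURE);
    }
    return memory;
}
```

With the check centralized here, the rest of the code can use the result directly instead of repeating the null test everywhere.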
To Use Enums or Not to Use Enums
C++ allows constant variables, which should deprecate the
use of enums as constants. Unfortunately, in most compilers
constants take space. Some compilers will remove constants,
but not all. Constants taking space precludes them from being
used in tight memory environments like embedded systems.
Workstation users should use constants and ignore the rest
of this discussion.
In general enums are preferred to #define as
enums are understood by the debugger.
Be aware enums are not of a guaranteed size. So if you have a
type that can take a known range of values and it is transported
in a message you can't use an enum as the type. Use the correct
integer size and use constants or #define. Casting
between integers and enums is very error prone as you could
cast a value not in the enum.
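For example, a message field sketched with a known-size type and constants (the names are hypothetical); incoming values are validated explicitly instead of being cast to an enum:

```cpp
// Exactly one byte on the wire, which an enum can't guarantee.
typedef unsigned char MessageType;

const MessageType MSG_CONNECT    = 1;
const MessageType MSG_DISCONNECT = 2;
const MessageType MSG_DATA       = 3;

// Check the value range explicitly when a message arrives.
inline bool IsValidMessageType(MessageType type)
{
    return type >= MSG_CONNECT && type <= MSG_DATA;
}
```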
A C++ Workaround
C++ allows static class variables. These variables are available
anywhere and only the expected amount of space is taken.
Example
class Variables
{
public:
static const int A_VARIABLE;
static const int B_VARIABLE;
static const int C_VARIABLE;
};
Macros
Don't Turn C++ into Pascal
Don't change syntax via macro substitution. It makes the program
unintelligible to all but the perpetrator.
Replace Macros with Inline Functions
In C++ macros are not needed for code efficiency. Use inlines.
Example
#define MAX(x,y) (((x) > (y)) ? (x) : (y)) // Get the maximum
The macro above can be replaced for integers with the following inline function
with no loss of efficiency:
inline int
max(int x, int y)
{
return (x > y ? x : y);
}
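The inline above only covers ints. A template keeps the type safety for any comparable type with no loss of efficiency (a sketch; the standard library's std::max does the same job):

```cpp
template <typename T>
inline T Max(T x, T y)
{
    return (x > y ? x : y);
}
```

Unlike the macro, arguments are evaluated exactly once, so Max(f(x), z++) is safe.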
Be Careful of Side Effects
Macros should be used with caution because of the potential for error when
invoked with an expression that has side effects.
Example
MAX(f(x),z++);
Always Wrap the Expression in Parentheses
When putting expressions in macros always wrap the expression
in parentheses to avoid operator precedence problems.
Example
#define ADD(x,y) x + y
must be written as
#define ADD(x,y) ((x) + (y))
Make Macro Names Unique
Like global variables macros can conflict with macros from
other packages.
Prepend macro names with package names.
Avoid simple and common names like MAX and MIN.
Do Not Default If Test to Non-Zero
Do not default the test for non-zero, i.e.
if (FAIL != f())
is better than
if (f())
even though FAIL may have the value 0 which C considers to be false. An
explicit test will help you out later when somebody decides that a failure
return should be -1 instead of 0. Explicit comparison should be used even if
the comparison value will never change; e.g., if (!(bufsize % sizeof(int)))
should be written instead as if ((bufsize % sizeof(int)) == 0) to reflect
the numeric (not boolean) nature of the test. A frequent trouble spot is using
strcmp to test for string equality, where the result should never ever be defaulted. The preferred approach is to define a macro
STREQ.
#define STREQ(a, b) (strcmp((a), (b)) == 0)
Or better yet use an inline method:
inline bool
StringEqual(const char* a, const char* b)
{
return strcmp(a, b) == 0;
}
Note, this is just an example, you should really use the standard
library string type for doing the comparison.
The non-zero test is often defaulted for predicates and other functions or
expressions which meet the following restrictions:
Returns 0 for false, nothing else.
Is named so that the meaning of (say) a true return is absolutely
obvious. Call a predicate IsValid(), not CheckValid().
The Bull of Boolean Types
Any project using source code from many sources knows the pain
of multiple conflicting boolean types. The new C++ standard
defines a native boolean type. Until all compilers support
bool, and existing code is changed to use it, we must still
deal with the cruel world.
The form of boolean most accurately matching the new standard is:
typedef int bool;
#define TRUE 1
#define FALSE 0
or
const int TRUE = 1;
const int FALSE = 0;
Note, the standard defines the names true and false,
not TRUE and FALSE. The all caps versions are used so they won't clash
with the standard versions where those are available.
Even with these declarations, do not check a boolean value for equality with 1
(TRUE, YES, etc.); instead test for inequality with 0 (FALSE, NO, etc.). Most
functions are guaranteed to return 0 if false, but only non-zero if true. Thus,
if (TRUE == func()) { ...
must be written
if (FALSE != func()) { ...
Usually Avoid Embedded Assignments
There is a time and a place for embedded assignment statements. In some
constructs there is no better way to accomplish the results without making the
code bulkier and less readable.
while (EOF != (c = getchar()))
{
process the character
}
The ++ and -- operators count as assignment statements, and so, for many
purposes, do functions with side effects. Using embedded assignment statements to
improve run-time performance is also possible. However, one should consider
the tradeoff between increased speed and decreased maintainability that results
when embedded assignments are used in artificial places. For example,
a = b + c;
d = a + r;
should not be replaced by
d = (a = b + c) + r;
even though the latter may save one cycle. In the long run the time difference
between the two will decrease as the optimizer gains maturity, while the
difference in ease of maintenance will increase as the human memory of what's
going on in the latter piece of code begins to fade.
Reusing Your Hard Work and the Hard Work of Others
Reuse across projects is almost impossible without
a common framework in place. Objects conform to the
services available to them. Different projects
have different service environments making object reuse
difficult.
Developing a common framework takes a lot of up front
design effort. When this effort is not made, for
whatever reasons, there are several techniques
one can use to encourage reuse:
Ask! Email a Broadcast Request to the Group
This simple technique is rarely done. For some reason
programmers feel it makes them seem less capable
if they ask others for help. This is silly! Do new
interesting work. Don't reinvent the same stuff over
and over again.
If you need a piece of code email to the group asking if
someone has already done it. The results can be surprising.
In most large groups individuals have no idea what other people
are doing. You may even find someone is looking for something
to do and will volunteer to do the code for you. There's always a
gold mine out there if people work together.
Tell! When You do Something Tell Everyone
Let other people know if you have done something reusable.
Don't be shy. And don't hide your work to protect your pride.
Once people get in the habit of sharing work everyone gets
better.
Don't be Afraid of Small Libraries
One common enemy of reuse is people not making
libraries out of their code. A reusable class may be
hiding in a program directory and will never have
the thrill of being shared because the programmer
won't factor the class or classes into a library.
One reason for this is because people don't like making small
libraries. There's something about small libraries that
doesn't feel right. Get over it. The computer doesn't care
how many libraries you have.
If you have code that can be reused and can't be placed in an
existing library then make a new library. Libraries don't stay
small for long if people are really thinking about reuse.
If you are afraid of having to update makefiles when libraries
are recomposed or added then don't include libraries in your
makefiles, include the idea of services. Base level makefiles
define services that are each composed of a set of libraries.
Higher level makefiles specify the services they want. When the
libraries for a service change only the lower level makefiles will
have to change.
Keep a Repository
Most companies have no idea what code they have. And most
programmers still don't communicate what they have done or
ask for what currently exists. The solution is to keep
a repository of what's available.
In an ideal world a programmer could go to a web page, browse
or search a list of packaged libraries, taking what they
need. If you can set up such a system where programmers
voluntarily maintain such a system, great. If you have a
librarian in charge of detecting reusability, even better.
Another approach is to automatically generate a repository
from the source code. This is done by using common
class, method, library, and subsystem headers that can double as man
pages and repository entries.
Commenting Out Large Code Blocks
Sometimes large blocks of code need to be commented out for testing.
Using #if 0
The easiest way to do this is with an #if 0 block:
void
example()
{
great looking code
#if 0
lots of code
#endif
more code
}
You can't use /* */ style comments because comments can't
contain comments and surely a large block of your code will contain
a comment, won't it?
Don't use #ifdef as someone can unknowingly trigger ifdefs from the
compiler command line.
Use Descriptive Macro Names Instead of 0
The problem with #if 0 is that even a day later you or anyone
else may have no idea why this code is commented out. Is it because
a feature has been dropped? Is it because it was buggy? It didn't
compile? Can it be added back? It's a mystery.
Instead use a descriptive macro name such as NOT_YET_IMPLEMENTED,
OBSOLETE, or TEMP_DISABLED, and add a short comment explaining why the
code is not implemented, obsolete, or temporarily disabled.
Use #if Not #ifdef
Use #if MACRO not #ifdef MACRO. Someone might write code like:
#ifdef DEBUG
temporary_debugger_break();
#endif
Someone else might compile the code with debug info turned off like:
cc -c lurker.cpp -DDEBUG=0
Always use #if, if you have to use the preprocessor. This works fine, and
does the right thing, even if DEBUG is not defined at all (!)
#if DEBUG
temporary_debugger_break();
#endif
If you really need to test whether a symbol is defined or not, test it
with the defined() construct, which allows you to add more things later
to the conditional without editing text that's already in the program:
#if defined(DEBUG)
temporary_debugger_break();
#endif
It's a good idea to typedef int8, int16, int32, int64, float32, float64, uint8,
uint16, uint32, uint64, etc., instead of assuming it'll be done with int, long,
float, and short.
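A sketch of such typedefs for one common platform where char is 8 bits, short is 16, int is 32, and long long is 64; the widths must be checked per target (a modern compiler's <stdint.h> provides int8_t and friends for the same purpose):

```cpp
typedef signed char        int8;
typedef unsigned char      uint8;
typedef short              int16;
typedef unsigned short     uint16;
typedef int                int32;
typedef unsigned int       uint32;
typedef long long          int64;
typedef unsigned long long uint64;
typedef float              float32;
typedef double             float64;
```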
Alignment of Class Members
There seems to be disagreement on how to align class data
members. Be aware that different platforms have different
alignment rules and it can be an issue. Alignment may also
be an issue when using shared memory and shared libraries.
The real thing to remember when it comes to alignment is to
put the biggest data members first, and smaller members later, and to pad
with char[] so that the same structure would be used no matter whether the
compiler was in "naturally aligned" or "packed" mode.
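A sketch of that layout (the field names are hypothetical): biggest members first, explicit char[] padding to a fixed total, so packed and naturally aligned builds produce the same structure:

```cpp
struct Record
{
    double timestamp;  // 8 bytes, biggest member first
    int    count;      // 4 bytes
    short  flags;      // 2 bytes
    char   status;     // 1 byte
    char   pad[1];     // explicit padding to a 16-byte total
};
```

On a typical machine with 8-byte doubles this comes out to 16 bytes whether or not the compiler packs structures.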
For the Mac there's no blanket "always on four byte boundaries" rule -- rather,
the rule is "alignment is natural, but never bigger than 4 bytes, unless the
member is a double and first in the struct in which case it is 8". And that
rule was inherited from PowerOpen/AIX.
Compiler Dependent Exceptions
Using exceptions across the shared library boundary
could cause some problems if the shared library and
the client module are compiled by different compiler
vendors.
Compiler Dependent RTTI
Different compilers are not guaranteed to name types
the same.