Tuesday, August 16, 2011

The Role of Meaningful Names in a Software Developer’s Code

Developers name variables, functions, arguments, classes, and namespaces. We name our
source files and the directories that contain them. We name our library files. We name and
name and name. Because we do so much of it, we’d better do it well. What follows are some simple rules
for creating good names.

Use Intention-Revealing Names
It is easy to say that names should reveal intent. Choosing good names takes time but saves more than it
takes. So take care with your names and change them when you find better ones. Everyone who reads
your code (including you) will be happier if you do.

The name of a variable, function, or class should answer all the big questions. It should tell you why it
exists, what it does, and how it is used. If a name requires a comment, then the name does not reveal
its intent.

int d; // elapsed time in days

The name d reveals nothing. It does not evoke a sense of elapsed time, nor of days. We should choose
a name that specifies what is being measured and the unit of that measurement:

int elapsedTimeInDays;
int daysSinceCreation;
int daysSinceModification;
int fileAgeInDays;

Choosing names that reveal intent can make it much easier to understand and change code. What is the
purpose of this code?

public List<int[]> getThem() {
    List<int[]> list1 = new ArrayList<int[]>();
    for (int[] x : theList)
        if (x[0] == 4)
            list1.add(x);
    return list1;
}

Why is it hard to tell what this code is doing? There are no complex expressions. Spacing and indentation
are reasonable. There are only three variables and two constants mentioned. There aren’t even any fancy
classes or polymorphic methods, just a list of arrays (or so it seems).

The problem isn’t the simplicity of the code but the implicity of the code (to coin a phrase): the degree to
which the context is not explicit in the code itself. The code implicitly requires that we know the answers to
questions such as:
1. What kinds of things are in theList?
2. What is the significance of the zeroth subscript of an item in theList?
3. What is the significance of the value 4?
4. How would I use the list being returned?

Now look at the same code with better names. The names themselves answer these questions:

public List<int[]> getFlaggedCells() {
    List<int[]> flaggedCells = new ArrayList<int[]>();
    for (int[] cell : gameBoard)
        if (cell[STATUS_VALUE] == FLAGGED)
            flaggedCells.add(cell);
    return flaggedCells;
}

Notice that the simplicity of the code has not changed. It still has exactly the same number of operators and
constants, with exactly the same number of nesting levels. But the code has become much more explicit.
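
One possible further step, sketched here under assumptions the text does not state: the int[] cells could be wrapped in a small class so that the subscript and the status code disappear behind a method name. The names Cell, GameBoard, place, and isFlagged are hypothetical, and the values 0 and 4 are simply carried over from the first version of the code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper that hides the int[] representation of a cell.
class Cell {
    private static final int STATUS_VALUE = 0; // assumed index of the status field
    private static final int FLAGGED = 4;      // assumed status code for "flagged"

    private final int[] fields;

    Cell(int... fields) {
        this.fields = fields;
    }

    boolean isFlagged() {
        return fields[STATUS_VALUE] == FLAGGED;
    }
}

// Hypothetical owner of the cells, standing in for gameBoard.
class GameBoard {
    private final List<Cell> cells = new ArrayList<>();

    void place(Cell cell) {
        cells.add(cell);
    }

    List<Cell> getFlaggedCells() {
        List<Cell> flaggedCells = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.isFlagged()) {
                flaggedCells.add(cell);
            }
        }
        return flaggedCells;
    }
}

class FlaggedCellsDemo {
    public static void main(String[] args) {
        GameBoard board = new GameBoard();
        board.place(new Cell(4)); // flagged
        board.place(new Cell(0)); // not flagged
        System.out.println(board.getFlaggedCells().size()); // prints 1
    }
}
```

The loop has the same shape as before; the only thing that changed is how much the reader has to know in advance.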

Avoid Disinformation
Programmers must avoid leaving false clues that obscure the meaning of code. They should avoid words
whose entrenched meanings vary from the intended meaning. For example, hp, aix, and sco would be
poor variable names because they are the names of Unix platforms or variants. Even if you are coding a
hypotenuse and hp looks like a good abbreviation, it could be disinformative.

Do not refer to a grouping of accounts as an accountList unless it’s actually a List. The word list means
something specific to programmers. If the container holding the accounts is not actually a List, it may lead
to false conclusions. So accountGroup or bunchOfAccounts or just plain accounts would be better.
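
A minimal sketch of the point, assuming the accounts happen to live in a Set. The class name, method name, and account ids are made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class AccountNamingDemo {
    // The container is a Set, so the variable is called just "accounts".
    // Calling it accountList would suggest access by index and room for
    // duplicates, neither of which a Set provides.
    static Set<String> collectAccounts(String... accountIds) {
        Set<String> accounts = new LinkedHashSet<>();
        for (String accountId : accountIds) {
            accounts.add(accountId);
        }
        return accounts;
    }

    public static void main(String[] args) {
        Set<String> accounts = collectAccounts("A-1", "B-2", "A-1");
        System.out.println(accounts.size()); // prints 2: the duplicate is dropped
    }
}
```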

Beware of using names which vary in small ways. How long does it take to spot the subtle difference between
an XYZControllerForEfficientHandlingOfStrings in one module and, somewhere a little more distant,
XYZControllerForEfficientStorageOfStrings? The words have frightfully similar shapes.

Spelling similar concepts similarly is information. Using inconsistent spellings is disinformation. Modern
development environments offer automatic code completion. We write a few characters of a name and press
some hotkey combination (if that) and are rewarded with a list of possible completions for that name. It is
very helpful if names for very similar things sort together alphabetically and if the differences are very obvious,
because the developer is likely to pick an object by name without seeing your copious comments or even the
list of methods supplied by that class.

A truly awful example of disinformative names would be the use of lower-case L or uppercase O as variable
names, especially in combination. The problem, of course, is that they look almost entirely like the constants
one and zero, respectively.

int a = l;
if ( O == l )
    a = O1;
else
    l = 01;

The reader may think this a contrivance, but we have examined code where such things were abundant. In one
case the author of the code suggested using a different font so that the differences were more obvious, a
solution that would have to be passed down to all future developers as oral tradition or in a written document.
The problem is conquered with finality and without creating new work products by a simple renaming.

Make Meaningful Distinctions
Programmers create problems for themselves when they write code solely to satisfy a compiler or interpreter.
For example, because you can’t use the same name to refer to two different things in the same scope, you
might be tempted to change one name in an arbitrary way. Sometimes this is done by misspelling one, leading
to the surprising situation where correcting spelling errors leads to an inability to compile. It is not sufficient to
add number series or noise words, even though the compiler is satisfied. If names must be different, then they
should also mean something different.

Number-series naming (a1, a2, .. aN) is the opposite of intentional naming. Such names are not
disinformative but noninformative; they provide no clue to the author’s intention. Consider:

public static void copyChars(char a1[], char a2[]) {
    for (int i = 0; i < a1.length; i++) {
        a2[i] = a1[i];
    }
}

This function reads much better when source and destination are used for the argument names.
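
For comparison, here is the same function with only the argument names changed, wrapped in a hypothetical CopyCharsDemo class so that it can be run as is:

```java
class CopyCharsDemo {
    // Same loop as before; only the parameter names are different.
    static void copyChars(char[] source, char[] destination) {
        for (int i = 0; i < source.length; i++) {
            destination[i] = source[i];
        }
    }

    public static void main(String[] args) {
        char[] source = {'a', 'b', 'c'};
        char[] destination = new char[source.length];
        copyChars(source, destination);
        System.out.println(new String(destination)); // prints abc
    }
}
```

At the call site it is now hard to pass the arrays in the wrong order without noticing.
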
Noise words are another meaningless distinction. Imagine that you have a Product class. If you have
another class called ProductInfo or ProductData, you have made the names different without making them
mean anything different. Info and Data are indistinct noise words like a, an, and the.

Note that there is nothing wrong with using prefix conventions like a and the so long as they make a
meaningful distinction. For example, you might use a for all local variables and the for all function
arguments. The problem comes in when you decide to call a variable theSource because you already have
another variable named source.

Noise words are redundant. The word variable should never appear in a variable name. The word table
should never appear in a table name. How is NameString better than Name? Would a Name ever be a
floating point number? If so, it breaks an earlier rule about disinformation. Imagine finding one class named
Customer and another named CustomerObject. What should you understand as the distinction? Which
one will represent the best path to a customer’s payment history? There is an application we know of where
this is illustrated. We’ve changed the names to protect the guilty, but here’s the exact form of the error:

getActiveAccount();
getActiveAccounts();
getActiveAccountInfo();

How are the programmers in this project supposed to know which of these functions to call? In the absence
of specific conventions, the variable moneyAmount is indistinguishable from money, customerInfo is
indistinguishable from customer, accountData is indistinguishable from account, and theMessage is
indistinguishable from message. Distinguish names in such a way that the reader knows what the differences
offer.


Use Pronounceable Names
A company I know had a variable named genymdhms (generation date, year, month, day, hour, minute, and
second), so they walked around saying “gen why emm dee aich emm ess”. I have an annoying habit of pronouncing
everything as written, so I started saying “gen-yah-muddahims.” Later a host of designers and analysts
were calling it that too, and we still sounded silly. But we were in on the joke, so it was fun. Fun or not,
we were tolerating poor naming. New developers had to have the variables explained to them, and then they
spoke about them in silly made-up words instead of using proper English terms. Compare

class DtaRcrd102 {
    private Date genymdhms;
    private Date modymdhms;
    private final String pszqint = "102";
    /* ... */
}

to

class Customer {
    private Date generationTimestamp;
    private Date modificationTimestamp;
    private final String recordId = "102";
    /* ... */
}

Intelligent conversation is now possible: “Hey, Mikey, take a look at this record! The generation timestamp
is set to tomorrow’s date! How can that be?”

Use Searchable Names
Single-letter names and numeric constants have a particular problem in that they are not easy to locate
across a body of text. One might easily grep for MAX_CLASSES_PER_STUDENT, but the number 7 could be more
troublesome. Searches may turn up the digit as part of file names, other constant definitions, and in various
expressions where the value is used with different intent. It is even worse when a constant is a long number
and someone might have transposed digits, thereby creating a bug while simultaneously evading the
programmer’s search.

Likewise, the name e is a poor choice for any variable for which a programmer might need to search. It is the
most common letter in the English language and likely to show up in every passage of text in every program.
In this regard, longer names trump shorter names, and any searchable name trumps a constant in code.

The experts’ preference is that single-letter names be used ONLY as local variables inside short methods.
The length of a name should correspond to the size of its scope. If a variable or constant might be seen or used
in multiple places in a body of code, it is imperative to give it a search-friendly name. Once again compare

for (int j=0; j<34; j++) {
    s += (t[j]*4)/5;
}

to

int realDaysPerIdealDay = 4;
const int WORK_DAYS_PER_WEEK = 5;
int sum = 0;
for (int j=0; j < NUMBER_OF_TASKS; j++) {
    int realTaskDays = taskEstimate[j] * realDaysPerIdealDay;
    int realTaskWeeks = (realTaskDays / WORK_DAYS_PER_WEEK);
    sum += realTaskWeeks;
}

Note that sum, above, is not a particularly useful name but at least is searchable. The intentionally named code
makes for a longer function, but consider how much easier it will be to find WORK_DAYS_PER_WEEK than to
find all the places where 5 was used and filter the list down to just the instances with the intended meaning.

Avoid Encodings
We have enough encodings to deal with without adding more to our burden. Encoding type or scope
information into names simply adds an extra burden of deciphering. It hardly seems reasonable to require
each new employee to learn yet another encoding “language” in addition to learning the (usually considerable)
body of code that they’ll be working in. It is an unnecessary mental burden when trying to solve a problem.
Encoded names are seldom pronounceable and are easy to mis-type.

Avoid Mental Mapping
Readers shouldn’t have to mentally translate your names into other names they already know. This problem
generally arises from a choice to use neither problem domain terms nor solution domain terms.

This is a problem with single-letter variable names. Certainly a loop counter may be named i or j or k
(though never l!) if its scope is very small and no other names can conflict with it. This is because those
single-letter names for loop counters are traditional.

However, in most other contexts a single-letter name is a poor choice; it’s just a placeholder that the
reader must mentally map to the actual concept. There can be no worse reason for using the name c
than because a and b were already taken.

One difference between a smart programmer and a professional programmer is that the professional
understands that clarity is king. Professionals use their powers for good and write code that others
can understand.

Class Names
Classes and objects should have noun or noun phrase names like Customer, WikiPage, Account,
and AddressParser. Avoid words like Manager, Processor, Data, or Info in the name of a
class. A class name should not be a verb.

Method Names
Methods should have verb or verb phrase names like postPayment, deletePage, or save.
Accessors, mutators, and predicates should be named for their value and prefixed with get,
set, and is according to the coding standards.

String name = employee.getName();
customer.setName("mike");
if (paycheck.isPosted())...

When constructors are overloaded, use static factory methods with names that describe the arguments.
For example,

Complex fulcrumPoint = Complex.FromRealNumber(23.0);

is generally better than

Complex fulcrumPoint = new Complex(23.0);

Consider enforcing their use by making the corresponding constructors private.
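
A minimal sketch of how this could look in Java. It is only an illustration: the fields, the accessors, and the second factory method fromImaginaryNumber are assumptions, and the factory is spelled fromRealNumber here to follow the usual Java method casing.

```java
final class Complex {
    private final double real;
    private final double imaginary;

    // Private: callers cannot write new Complex(23.0) and must use a named factory.
    private Complex(double real, double imaginary) {
        this.real = real;
        this.imaginary = imaginary;
    }

    // The name says what the single argument means.
    static Complex fromRealNumber(double real) {
        return new Complex(real, 0.0);
    }

    // Hypothetical second factory: same argument type, different meaning.
    static Complex fromImaginaryNumber(double imaginary) {
        return new Complex(0.0, imaginary);
    }

    double real() {
        return real;
    }

    double imaginary() {
        return imaginary;
    }
}

class ComplexDemo {
    public static void main(String[] args) {
        Complex fulcrumPoint = Complex.fromRealNumber(23.0);
        System.out.println(fulcrumPoint.real() + " + " + fulcrumPoint.imaginary() + "i"); // prints 23.0 + 0.0i
    }
}
```

Two overloaded constructors that both take a single double could not coexist at all, which is one more reason to prefer the named factories.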

Pick One Word per Concept
Pick one word for one abstract concept and stick with it. For instance, it’s confusing to have fetch,
retrieve, and get as equivalent methods of different classes. How do you remember which
method name goes with which class? Sadly, you often have to remember which company, group,
or individual wrote the library or class in order to remember which term was used. Otherwise, you
spend an awful lot of time browsing through headers and previous code samples.

Likewise, it’s confusing to have a controller and a manager and a driver in the same code base.
What is the essential difference between a DeviceManager and a ProtocolController? Why are both not
controllers or both not managers? Are they both really drivers? The names lead
you to expect two objects that have very different types as well as different classes.

A consistent lexicon is a great boon to the programmers who must use your code.

Don’t Pun
Avoid using the same word for two purposes. Suppose many of our classes have an add method that creates a
new value by adding or concatenating two existing values. Now let’s say we are writing a new class that has
a method that puts its single parameter into a collection. Should we call this method add? It might seem
consistent because we have so many other add methods, but in this case, the semantics are different, so we
should use a name like insert or append instead. To call the new method add would be a pun.
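
A small sketch of the distinction, assuming the existing add methods combine two values into a new one. Money and Ledger are hypothetical classes invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    // add: combines two values into a new one; neither operand changes.
    Money add(Money other) {
        return new Money(cents + other.cents);
    }

    long cents() {
        return cents;
    }
}

class Ledger {
    private final List<Money> entries = new ArrayList<>();

    // append, not add: this method changes the ledger by storing the entry.
    void append(Money entry) {
        entries.add(entry);
    }

    int size() {
        return entries.size();
    }

    Money total() {
        Money total = new Money(0);
        for (Money entry : entries) {
            total = total.add(entry);
        }
        return total;
    }
}

class DontPunDemo {
    public static void main(String[] args) {
        Ledger ledger = new Ledger();
        ledger.append(new Money(150));
        ledger.append(new Money(250));
        System.out.println(ledger.total().cents()); // prints 400
    }
}
```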


Note:
Pascal casing and camel casing are the two capitalization styles most commonly used in code today.
Pascal casing - The first character of every word is upper case and the other characters are lower case.
Example: BackColor
Camel casing - The first character of every word except the first is upper case and the other characters are lower case.
Example: backColor

1.       Use Pascal casing for Class names

public class HelloWorld
{
                ...
}

2.       Use Pascal casing for Method names

void SayHello(string name)
{
                ...
}


3.       Use Camel casing for variables and method parameters

int totalCount = 0;
void SayHello(string name)
{
                string fullMessage = "Hello " + name;
                ...
}

4.       Use the prefix “I” with Pascal casing for interfaces (Example: IEntity)

5.       Do not use Hungarian notation to name variables.

In earlier days, most programmers liked Hungarian notation: the data type as a prefix for the variable name, and
m_ as a prefix for member variables. E.g.:

string m_sName;
int nAge;

However, .NET coding standards do not recommend this. Do not use data type prefixes or m_ to mark member
variables. All variables should use camel casing.

Some programmers still prefer to use the prefix m_ for member variables, since there is no other easy way to identify a member variable.


6.       Use Meaningful, descriptive words to name variables. Do not use abbreviations.

Good:

string address
int salary

Not Good:

string nam
string addr
int sal

7.       Do not use single-character variable names like i, n, s etc. Use names like index, temp.

One exception in this case would be variables used for iterations in loops:

for ( int i = 0; i < count; i++ )
{
                ...
}

If the variable is used only as a counter for iteration and is not used anywhere else in the loop, many people still like
to use a single char variable (i) instead of inventing a different suitable name.

8.       Do not use underscores (_) for local variable names.

9.       All member variables must be prefixed with underscore (_) so that they can be identified from other local variables.

10.    Do not use variable names that resemble keywords.

11.    Prefix boolean variables, properties and methods with “is” or similar prefixes.

Ex: private bool _isFinished

12.    Namespace names should follow the standard pattern

<company name>.<product name>.<top level module>.<bottom level module>

13.    Use appropriate prefix for the UI elements so that you can identify them from the rest of the variables.

There are 2 different approaches recommended here.

a.       Use a common prefix ( ui_ ) for all UI elements. This will help you group all of the UI elements together
and make them easy to access from IntelliSense.

b.       Use an appropriate prefix for each UI element. A brief list is given below. Since .NET provides many
controls, you may have to arrive at a complete list of standard prefixes for each of the controls (including
third party controls) you are using.


Control            Prefix
Label              lbl
TextBox            txt
DataGrid           dtg
Button             btn
ImageButton        imb
Hyperlink          hlk
DropDownList       ddl
ListBox            lst
DataList           dtl
Repeater           rep
Checkbox           chk
CheckBoxList       cbl
RadioButton        rdo
RadioButtonList    rbl
Image              img
Panel              pnl
PlaceHolder        phd
Table              tbl
Validators         val



14.    The file name should match the class name.

For example, for the class HelloWorld, the file name should be HelloWorld.cs (or HelloWorld.vb).

15.    Use Pascal Case for file names.

