Friday, April 10, 2020

Best Practices for Coding

 | 
In my last article i was stressing the importance of testing code with realistic data volumes in order to detect potential performance issues when the code is actually deployed. that got me to thinking about something that i don't normally think about. the quality of the code we write. now you are probably wondering what on earth i mean by the "quality" of the code that we write? after all, code is either good (it works, does what it is supposed to and doesn't crash) or bad (it doesn't do what it is supposed to, or crashes). however, that addresses only the functionality of the code, not its quality.

one of vfp's greatest features is that there are almost always several different ways of doing the same thing. unfortunately, this is also one of the worst things about the language because, if there were only one way of doing something, you would have no choice. what i notice, whenever i am working on code (my own or someone else’s) is how easily we developers fall into patterns of doing things. once we have figured out the way to tackle some particular type of operation we tend to stick to it unquestioningly. i believe there are three critical issues that affect the quality of our code:

maintainability    ask yourself this question. could someone (assuming they are a technically competent developer) take over your maintaining and enhancing your code? is there (current!) design and implementation documentation explaining what the code is supposed to do and how it actually does it? is the code itself commented so that it is clear what is going on at every point?
efficiency           code can easily be functionally correct, but perform sub-optimally. the vacation calculator that took 45 minutes to run (that i referred in my last article) is a perfect example of code that worked but was not optimal. as noted, determining whether the code is optimal is largely a matter of proper testing at each level, but that is detection after the fact. what can you do to avoid writing sub-optimal to start with?
best practices    this is probably the hardest one of all. what are the "best practices' for vfp code? who decides what they are, and where are they defined? the only answer to these questions is that i don't know.
therein lies the problem; there are no absolute rules that can be applied to all of these issues, in all cases. as with so many things it is a largely a matter of personal judgment and preference. what i can do is to offer some suggestions, based on my 20+ years of working with foxpro in its various incarnations (if there are mistakes to be made, you can bet that i've probably made them all at one time or another). so here goes with my top ten rules for writing better code …

rule #1: never copy and paste code     since every piece of code is different - in either intent or implementation (or both), there can never be a situation where you should copy and paste code that was written elsewhere. if you do find yourself copying and pasting then it means that you are in one of two situations. first you have an operation which closely mimics an existing operation. in this case you should probably be designing a more generic solution to handle both operations rather than trying to hack existing code. or second, you are simply duplicating functionality that already exists elsewhere because, in its current form or location, it cannot be accessed directly. in this case the code should not be duplicated but should be extracted and made available as either a method or procedure to everything that needs to use it.

rule #2: don't adopt global solutions to solve local problems there are basically two types of environmental control commands in vfp; those that are scoped to the datasession (e.g. set exclusive and set deleted) and those that are global to vfp itself (e.g. set escape and set nulldisplay). in either case these commands are actually global – they affect everything within their scope and so should only be used when you really need to change the working environment as a whole. consider the following code (from a real application i may add) whose intent was to un-delete records previously flagged for deletion:

procedure <name>
use <table> order <tagname>
set deleted off
set filter to deleted()
***
***** functional code here
***
set deleted on
return

now, what's wrong here. first, there is absolutely no error checking at all and the use command is poorly implemented, at the very least an "in 0" should have been used to avoid inadvertently closing an already open table. however, the bad part is the explicit use of the global set commands. irrespective of whatever else is going on in the application, all tables in the current datasession will be running with deleted = on after this procedure is called. while that is fine when it is the intended behavior, it shouldn't happen by chance. at the very least, the setting should have been checked on entry to the procedure and restore to whatever state was extant at the end. but the real issue is why even use a global command at all? if you want to un-delete records, why not just use recall for <condition>? it doesn't matter to the recall command where deleted is set on or off and there is no need to mess with a global setting at all.

rule #3: ensure that return values are meaningful      it is often said that the difference between a procedure and a function is that a function returns a value. however, in vfp this is not actually true because the only difference between a procedure and a function is how the code is called, and whether parameters are passed by default by reference (in a procedure call) or by value (in a function call). what this means, therefore, is that all vfp procedures, functions and methods must always return a value to the calling code (if nothing else is specified the value is simply a logical .t.). any code that might be interpreted as a function must, therefore, return a meaningful value. here is an example of code that, in the application, is called using the old "=" syntax (i.e. =opendbf('table_name'))

****************************************
function opendbf
parameter tablename, ordertag
****************************************
if not used(tablename)
  select 0
  use (tablename)
endif
select (tablename)
if parameters() = 2 and not empty(alltrim(ordertag))
  set order to tag &ordertag
endif

notice that there is no explicit return value so this "function" will always return "true" – even after an error. what is needed here is some local error trapping (try…catch was introduced into the language for precisely this kind of situation) and a proper return value that informs the calling code whether the function achieved its objective, or not, so that the calling code can take the appropriate action.

this leads directly to rules 4 and 5:

rule #4: always include an explicit return statement            the reason for this rule is that if you omit the return statement, and the last line of the procedure is a function call, there is no way to see the result in the debugger. for example, when debugging the procedure code above unless you pass the second (ordertag) parameter the last line of code that you can step into is the "select (tablename)" statement. by including the return command you provide a stopping point for the debugger so that you can evaluate the results of the last block of code too!

rule #5: always check the return value from a function call     possibly the worst example of violating this rule comes from the vfp help file itself. in the tableupdate() topic is the following "example":

replace clastname with 'jones'
? 'new clastname value: '
?? clastname  && displays new clastname value (jones).
= tableupdate(.t.)  && commits changes.
? 'updated clastname value: '
?? clastname  && displays current clastname value (jones).

this example has caused more developers more grief and prompted more questions on technical forums than any other single item in the help file. typically you see something like this:

bug: tableupdate not working!

my save button calls tableupdate and there is no error, but when i look in the table the changes aren't there. what's wrong?

the answer is almost always that the developer based their code on the help file example! but as stated, quite clearly in the help topic (under 'return value'), tableupdate() like all vfp functions returns a value. if the update succeeds, the return value is true; if it fails, no error is generated but the return value is false and the windows error structure is populated. the vfp aerror() function can be used to determine the actual error message. so the correct way to use the tableupdate() function is:

replace clastname with 'jones'
? 'new clastname value: '
?? clastname  && displays new clastname value (jones).
if not tableupdate(.t.)  && attempt to commit changes
  aerror( laerr )
  messagebox( laerr[2], 16, 'update failed' )
else
  ? 'updated clastname value: '
  ?? clastname  && displays current clastname value (jones).
endif

any time i see code that uses "=functionname()" i am immediately suspicious. what this is doing is saying that the result of the function call is irrelevant, and while that may be true in certain cases, it is certainly not the norm and is certainly not good practice.

rule #6: always check, and validate, input parameters this may seem obvious but, as shown in the opendbf() function above this fundamental rule is not always followed. my own preference is to use code like this to validate parameters and while there are many different ways of handling the issue, handled it must be!

function addconnection( tuconname )
  *** check the input parameter
  if vartype(tuconname) <> "o"
    *** we didn't get a connection object
    if vartype( tuconname ) = "c" and ! empty( tuconname )
      locondets = .getconnection( tuconname  )
      if not locondets.lstatus
        *** could not find this connection so just return
        return .makeresobj( .f. )
      endif
    else
      *** invalid or no connection name passed
      this.logerror( 9014, tuconname, lower( program()))
      return .makeresobj( .f. )
    endif
  else
    if lower( tuconname.class ) <> "xconnection"
      *** invalid or connection object passed
      this.logerror( 9014, tuconname, lower( program()))
      return .makeresobj( .f. )
    endif
    *** if we get to here we have a valid connection object
  endif

as you can see, this code handles the case where an object of a specific type is expected, but where the name of an object may be passed instead of the reference. either case is catered for and, while the code could be condensed it is both readable and unambiguous in this format. which leads me directly to rule 7.

rule #7: don't use 'cute', but obscure, coding  i, like most people, like others to think of me as being "clever" and i regard that as a natural enough desire. however, there is no excuse for projecting that desire into code so that it becomes obscure and difficult to understand. consider the following piece of code:

do case
  case ! checkparameters( tcname, tlstatus, tdlastdate )
    lcerror = "invalid parameters:
  case ! updatename( tcname )
    lcerror = "cannot update the specified name value"
  case isdateoutofrange( tdlastdate )
    lcerror = "date is outside the allowable range"
  case ! setstatus( tlstatus )
    lcerror = "cannot update the status flag"
  otherwise
    lcerror = ""
endcase
if not empty( lcerror )
  messagebox( lcerror, 16, 'error occurred' )
endif

what this code is doing is simple enough – it is using the case construct to call each function in turn because, in each case, the function is expected to return a result that means that the case condition evaluates as false. this forces the next case to be evaluated, and so on. however, this is not immediately obvious, and is certainly not a  normal use of the do case construct. especially in the absence of any comments it would very be hard to pick this out of a program and understand it without considerable effort. similarly code like this little function, also cute, is hardly readable, let alone maintainable:

lparameters tnday
local lcday
*** convert day number to the day of the week
lcday = iif( tnday = 1, 'sunday', iif( tnday = 2, 'monday', iif( tnday = 3, 'tuesday', iif( tnday = 4, ;
'wednesday', iif( tnday = 5, 'thursday', iif( tnday = 6, 'friday', iif( tnday = 7, 'saturday', ;
'invalid day number')))))))
return m.lcday

and here's another example – which would you prefer to run across in code you have to debug?

mimg = iif(file(sys(5)+sys(2003)+"\img\"+ m.ic +".jpg",sys(5)+sys(2003);
+"\img\"+ m.ic + ".jpg",sys(5)+sys(2003)+'\img\noimg.jpg')

or

if file(sys(5)+sys(2003)+"\img\"+ m.ic +".jpg")
  m.img = sys(5)+sys(2003)+"\img\"+ m.ic + ".jpg"
else
  m.img = sys(5)+sys(2003)+'\img\noimg.jpg'
endif

in each case the result is the same, the specified image file is checked for, and if found assigned to the variable, otherwise a default image is assigned instead. but, in my opinion clarity ( and hence maintainability) is more important than merely saving a couple of lines of code.

rule #8: don't create unnecessary functions    what on earth is an "unnecessary" function? simply it is a function whose only functionality is to call other functions. this example is from some code that someone sent to me recently:

if not isallupper( string )
  string = upper( string )
endif

what, i wondered, was the "isallupper()" function call for. it's not a vfp function so it had to be a udf and, opening the code, this is what i found:

*************************************************
function isallupper
*************************************************
parameters string
if upper(string) = string
  return .t.
else
  return .f.
endif

this is totally absurd since there is no conditional logic here at all. the result is to always force the string to upper case, so the only code needed is:

string = upper(string)

here's another example:

function cityline
lparameters lccity, lcstate, lczipcode
f = alltrim( lccity ) + ", " + alltrim( lcstate ) + " " + alltrim( lczipcode )
return f

since all this is doing is concatenating the three input parameters, the code could just as easily concatenate them directly. admittedly if this is done often and there is a possibility it might have to change, then there is some merit to the function (it concentrates the code in one place), or if there was some other variable that applied some other logic (country-specific formatting for example). however, as it stands it is clearly an unnecessary function and achieves nothing more than to introduce additional overhead into the program.
rule #9: don't use magic numbers       a 'magic' number is simply some undefined value which is used to interpret data in some way. again, from a real application i got the following piece of code:

if m_userlevel < 20
  if m_userid = m_owner
    m_userlevel = 11
  else
    m_userlevel = 12
  endif
endif

that is all that there was! now, by interpreting what the code was actually doing i was eventually able to figure out that userlevel 11 was "executive" and userlevel 12 was "other manager". but there were no other numbers used (anywhere in the code) and there was no definition – either in code, an include file or in a table – of what these numbers meant. if you need values like this they must be defined in a table somewhere. that immediately conveys four benefits. first, you won’t forget what they mean! second if you need to change the descrptions for these values, then you don’t need to change any code, you simply change the values in the table. third, you have a consistent set of definitions which allow you to display meaningful text without having to resort code like this:

do case
  case lnuserlevel = 11
    lcdisplayname = ‘responsible executive’
  case lnuserlevel = 12
    lcdisplayname = ‘manager’
  otherwise
    lcdisplayname = ‘unknown user’
endcase 

and finally you have a way to interpret the values for those situations when you don't have the ability to write the code directly (like in a report, or an extract file for example).

rule #10: name methods and functions appropriately this is another 'obvious' one! however, yet again, it is easy to find examples that show that people don't  do it. here is an example:

if ischange( m.username )
  replace username with m.username in loginusers
endif

now what would you expect this code to be doing? when i first saw it i assumed that it meant that if the user name has been changed, then update the "loginusers" table with the new user name. however, in the context of the application that didn't seem to make sense, so i went looking. here is the code called by ischanged():

procedure ischange
parameters string
return change2lower(string)

procedure change2lower
parameters string
string = alltrim(string)
retstring = ""
do while not empty(string)
  * raise first char as upper case
  string = upper(substr(string, 1, 1)) ;
         + iif(len(string) > 1, substr(string, 2), "")
  retstring = retstring + getfirstword(string) + " "
  string = removefirstword(string)
enddo
return retstring

if you examine this you will see that what it does is actually to capitalize the first letter of each word in the input string and put the rest into lower case. so not only are these examples of unnecessary functions - the actual code required is just

replace username with proper( m.username ) in loginusers

but the names used are totally inappropriate. since all ischange() does is to call change2lower() why is it named "ischange"? moreover, change2lower() does not change a string to lower case, it changes it to proper case.

how about this one, a program file named "productn.prg"? what does it do? well, here's the only comment from the code:

* productn - check for duplicates

hardly what you would call a descriptive name for the program!

there are lots more things that could be included, but these are my current list of the 'top 10' rules for writing code:

rule #1: never copy and paste code
rule #2: don't adopt global solutions to solve local problems
rule #3: ensure that return values are meaningful
rule #4: always include an explicit return statement
rule #5: always check the return value from a function call
rule #6: always check, and validate, input parameters
rule #7: don't use 'cute', but obscure, coding
rule #8: don't create unnecessary functions
rule #9: don't use magic numbers
rule #10: name methods and functions appropriately

I'd be interested ot hear from anyone with their list....

No comments:

Post a Comment

Writing better code (Part 1)

Writing better code (Part 1) As we all know, Visual FoxPro provides an extremely rich and varied development environment but sometimes to...