Help:Lua/Lua best practice

From Linux Web Expert

Revision as of 20:35, 6 April 2024 by imported>Shirayuki (new tvar syntax)

As with any best practice, this Lua best practice is heavily influenced by the individual contributors' impressions of what good practice should be. Use whatever you believe is important for your specific project, but know that these rules have been found to work!

The examples that follow are in Lua, but they should be easy to reformulate for JavaScript, PHP, and Python. The examples use Lua because modules in this language are used a lot, often by non-programmers who do not see the consequences of their choices.

As general advice, it can be a great help to import the code into a proper programming environment and check it there for obvious flaws and errors. Several free options exist, among them Eclipse, TextMate, and Visual Studio Code, to mention a few.

In particular, use a lint tool like lualint or luacheck. The latter is somewhat better. Also check whether tools for mess detection or other analysis are available for your programming environment.[1][2]
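As an illustration of the lint advice above, luacheck reads its settings from a file that is itself Lua. The fragment below is only a sketch of what a project's .luacheckrc might contain; the chosen values are assumptions, not a recommended configuration, and the option names should be checked against the luacheck documentation for your version.

```lua
-- .luacheckrc (hypothetical project configuration, values are assumptions)
std = 'lua51'            -- assume the modules target Lua 5.1
max_line_length = 80     -- mirrors the line limit suggested under Formatting
read_globals = { 'mw' }  -- assume the host provides a read-only "mw" table
```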

Remember: consistent style is more important than the "right" style – even if that style is the one you prefer![3]

Names

Descriptive names
Choose descriptive names, and avoid overly generic names like get() unless your code is truly generic. Don't repeat a description on several levels, like a method name set to the same as a class name, unless the method in fact returns an object of that class. Think colorful words, more descriptive for larger scopes, but throw away unneeded words.[4]
Short names
Cryptic short names should be avoided. Spell out the names, but stay concise. Don't use tmp_t; say temporaryTable or temporary_table instead. Often you can reformulate short names and still keep them short, like entities and categories. For example, use a plural name for tables holding several values.[5]
Iterator names
Iterator variables usually have short names in Lua, and the names carry special meaning. This goes against the advice on short names, and can create problems if several nested loops use the same names. A common solution is to add a prefix or suffix to the name, usually taken from the plural form used by the collection, or to use the singular form of the collection's name.[6]
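A minimal sketch of this naming approach, using hypothetical sample data: the loop variables take the singular form of the collection they iterate over, so the nested loops stay readable without reusing k and v.

```lua
-- Hypothetical data: each category holds a list of entity names.
local sampleCategories = {
    { name = 'fruit', entities = { 'apple', 'pear' } },
    { name = 'tool', entities = { 'hammer' } },
}

-- Build "category/entity" labels. The singular loop names show which
-- collection each value comes from, even inside the nested loop.
local function listLabels( categories )
    local labels = {}
    for _, category in ipairs( categories ) do
        for _, entity in ipairs( category.entities ) do
            table.insert( labels, category.name .. '/' .. entity )
        end
    end
    return labels
end

local labels = listLabels( sampleCategories )
```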
Names with types
Lua has a rather weak type system, even if it looks strong in some respects. Arguments should either be coerced to the correct type, or the function should throw an error. If the function exists in special variants to handle different types, then the type should be part of the function name.[7]
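The following sketch shows one way to read this advice. All function names are hypothetical: one helper coerces its argument or raises an error, and the type-specific variants carry the type in their names.

```lua
-- Hypothetical helper: accept a number or a numeric string, and raise
-- an error for anything else instead of failing somewhere later.
local function toLength( value )
    local length = tonumber( value )
    if length == nil then
        error( 'expected a number or numeric string, got ' .. type( value ), 2 )
    end
    return length
end

-- Variant for arguments that are already numbers.
local function areaFromNumbers( length, width )
    return length * width
end

-- Variant for string arguments; coerces, then delegates.
local function areaFromStrings( length, width )
    return areaFromNumbers( toLength( length ), toLength( width ) )
end
```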
Acronyms as names
Acronyms are intimidating for those who do not know their meaning. Do not use them except in the simplest and most obvious cases. In a library for creating HTML markup td() might be acceptable, but in a library for creating infoboxes the same name may not be.[8]
Format names
Formatting of names should follow the convention of the actual project. Lua as such uses several formatting styles, but in Wikimedia projects it seems to be UpperCamelCase for class names, and lowerCamelCase for function, method and variable names. Module names seem to be treated as variable names. There is no clear practice for constants, but UPPER_CASE_EMBEDDED_UNDERSCORE seems like a wise bet.[9]
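A small sketch of these formats side by side. The class, the constant and its value are made up for illustration only.

```lua
-- Constant: upper case with embedded underscores (value is arbitrary).
local MAX_LINE_COUNT = 2

-- Class name: UpperCamelCase.
local LineBuffer = {}
LineBuffer.__index = LineBuffer

-- Function, method and variable names: lowerCamelCase.
function LineBuffer.new()
    return setmetatable( { lineCount = 0 }, LineBuffer )
end

function LineBuffer:addLine( text )
    if self.lineCount >= MAX_LINE_COUNT then
        error( 'buffer is full', 2 )
    end
    self.lineCount = self.lineCount + 1
    self[self.lineCount] = text
    return self
end
```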
Single underscore
In Lua code a single underscore is used as a name wherever a name must be provided but the value is not used. That is, it is a placeholder that is otherwise ignored. This is recognized by some lint tools, and should also be respected in ordinary code.[10][11]
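Two typical uses of the placeholder, as a short sketch with a hypothetical sum function and sample string:

```lua
-- Sum the values of an array. The generic for requires a name for the
-- index, but the index is never used, so it is named "_".
local function sum( values )
    local total = 0
    for _, value in ipairs( values ) do
        total = total + value
    end
    return total
end

-- string.find returns the start and end positions before any captures;
-- only the capture is of interest here.
local _, _, year = string.find( 'Revised 2024', '(%d%d%d%d)' )
```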
Leading underscore
By convention a leading underscore is used for names that are somewhat private, but for some reason must be placed so they are accessible from the outside. Variables with such names should be treated as private to whatever defined them, unless a very clear explanation says that you can do some specific operations on them.[12][13]
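A sketch of the convention in a module table. The module p and both functions are hypothetical; the helper is reachable from the outside (for example by test code), but the underscore signals that callers should not rely on it.

```lua
local p = {}

-- Reachable from outside the module, but to be treated as private.
function p._normalize( name )
    -- the parentheses drop the second return value of gsub
    return ( string.gsub( string.lower( name ), '%s+', '_' ) )
end

-- Public entry point.
function p.link( name )
    return '[[' .. p._normalize( name ) .. ']]'
end
```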

Formatting

Limit line length
Lines should be limited to a reasonable length, typically around 80 characters. The reason is that long lines are hard to read, and as they grow longer they tend to contain more intermingled concepts, and editing them will tend to create new bugs.[14]
Indentation of newlines
Programmers have opinions on how to do indentation of newlines. Whatever opinion you have, do remember that crowdsourced projects should adhere to some common standards. Note that while a tab looks the same as 4 spaces in the online editor, it does not show up like this in the normal view. Because the online editor indents in specific ways, you should set your programming environment to act the same way.[15][16]
Binaries and line breaks
Sometimes it is necessary to put a line break inside an expression, typically in an if-clause. This poses a special problem, as it opens the door to copy-paste induced errors. To avoid problems a simple formatting trick can be used: put the binary logical operator in front of the expression on a new line.[17]
Example
Good:
if long_statement_A
    and long_statement_B
    and long_statement_C
then
    
Bad:
if long_statement_A and
    long_statement_B and
    long_statement_C then
    
Whitespace
Whitespace is usually a GoodThing™, but people tend to disagree on how to use it. Done right it tends to increase readability of your code, but done wrong your code can be really hard to read. Find a style guide and stick to it. Or use a tool.[18]

Documentation

Interface documentation
The interface is the set of functions exposed by the module, that is, the functions you can invoke from the parser function. As a minimum, document whatever arguments the interface function takes, and whatever value it returns. The parser function will only use a single returned value, no matter what the function returns. Remember to document if and how the metamethod __tostring will transform the return value.
Overview documentation
The documentation should have an overview to give a general explanation of why and how the module, function, or variable is the way it is. This is not just the intended behaviour, it is why the behaviour is the way it is.
Parameter annotation
Each parameter should be properly annotated. That is, give the expected type and a short descriptive string. Write as if LDoc or a similar tool was available, usually as -- @param type optional string.
Return annotation
Each return value should be properly annotated. That is, give the expected type and a short descriptive string. Write as if LDoc or a similar tool was available, usually as -- @return type optional string.
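A sketch of an annotated function, loosely following the comment form given above. The function and the wording of the descriptions are hypothetical, and the exact tag layout a documentation tool expects may differ, so check the tool you actually use.

```lua
--- Format a key-value pair for display.
-- The separator is fixed so that all rows line up the same way.
-- @param string key shown to the left of the separator
-- @param number value shown to the right of the separator
-- @return string the formatted pair
local function formatPair( key, value )
    return key .. '=' .. tostring( value )
end
```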

Comments

Inline comments for the code itself.

Why-form of comments
Comments are not only for you as the person who coded the program; they are also for those who read your code. The code path should answer the how-questions, and the comments should answer the why-questions. Do not fall into the trap of replicating in the comments what you do in the code path. Good comments provide new insight into the code.[19]
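The difference can be sketched with two versions of the same hypothetical function. The reason given in the second comment is invented for the example; the point is only the kind of information each comment carries.

```lua
-- Bad: the comment restates the code.
-- Remove the last character if it is a slash.
local function stripSlashBad( title )
    return ( string.gsub( title, '/$', '' ) )
end

-- Good: the comment says why the code exists.
-- Titles typed by hand often end in "/", which would refer to a
-- different page, so normalize before building the link.
local function stripSlash( title )
    return ( string.gsub( title, '/$', '' ) )
end
```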
State your intent
State the intent of your code in clear and precise language. Do not just add a lot of words; think about the readers and what they need to know to understand your code.[20]
Refactor on comments
When the code has become complex, it is common to start explaining the code in order to help yourself. If you need your own comments to follow the code, then something has become too involved and you should refactor the code.[21]
Comments on bad names
A comment saying that a name is bad in some respect does not help. Do not comment on bad names, fix them![22]
"Todo" comments
The marker todo is one of several common ones, and can be interpreted as "code the main programmer has not gotten around to fixing yet". Usually you should write it as -- @todo optional string.[23]
"Fixme" comments
The marker fixme is one of several common ones, and can be interpreted as "code someone other than the main programmer has identified as broken". Usually you should write it as -- @fixme optional string. This is an invitation to others to try to fix the code.[24]
"Hack" comments
The marker hack is one of several common ones, and can be interpreted as "code someone other than the main programmer has identified as broken and tried fixing in an inelegant way". Usually you should write it as -- @hack optional string. This is an invitation to others to try to fix the code.[25]
"XXX" comments
The marker xxx is one of several common ones, and can be interpreted as "code identified as dangerously broken". Usually you should write it as -- @xxx optional string. This marker should not be in production code; the code should be fixed as soon as possible.[26]
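A sketch of how some of these markers might look in practice. The function and the flaws it describes are made up for the example.

```lua
-- @todo accept a custom separator
-- @fixme names that contain a comma are not escaped
local function joinNames( names )
    -- @hack table.concat rejects entries such as false, so they are
    -- replaced with an empty string here instead of being filtered
    -- out where the list is built
    local cleaned = {}
    for index = 1, #names do
        cleaned[index] = names[index] or ''
    end
    return table.concat( cleaned, ', ' )
end
```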
Comments on magic numbers
Constants and other magic numbers should have comments explaining why they have the specific content. Why is the line length set to 80 chars? Why is π truncated to 3.14?[27]
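A sketch of commented constants, reusing the two questions above. The stated reasons are assumptions made for the example.

```lua
-- Matches the line limit suggested in the formatting section of this
-- guide, so generated text wraps like the surrounding code.
local MAX_LINE_LENGTH = 80

-- Taken from the math library rather than written as 3.14; a
-- truncated literal would need its own justification.
local PI = math.pi

local function circleArea( radius )
    return PI * radius * radius
end
```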


Code

Small patterns and coding practices.

Software patterns
A lot of work has gone into identifying common software design patterns, and developing good general solutions to those patterns. If you suspect you are attempting to do something that has a common pattern then do a search and check out how to implement the pattern. Still, know that patterns for languages with closures can be very different from those without closures. Usually tables are used in Lua to create objects, but they can also be created as closures.[28][29][30]
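The two ways of creating objects mentioned above can be sketched as follows, with a hypothetical counter. In the table form the state is a visible field; in the closure form it is hidden in an upvalue.

```lua
-- Object as a table with a metatable: state lives in the table.
local Counter = {}
Counter.__index = Counter

function Counter.new()
    return setmetatable( { count = 0 }, Counter )
end

function Counter:increment()
    self.count = self.count + 1
    return self.count
end

-- Object as a closure: state lives in the upvalue "count" and cannot
-- be reached from the outside.
local function newCounter()
    local count = 0
    return {
        increment = function()
            count = count + 1
            return count
        end,
    }
end
```

Note the different call syntax: the table form is called with a colon (counter:increment()), the closure form with a dot (counter.increment()).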
String quotes
In Lua there are several types of string quotes. Pick one of them and stick with it. If you need several types of quotes, either use your typical quote at the outermost level or at the innermost, and work your way out. Inside out with single quote as the primary quote and double quotes as secondary seems to be common, but could simply be accidental. If more quotes are necessary, then start using double square brackets at the tertiary level.[31][32]
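A short sketch of the three levels, with single quotes as the assumed primary style:

```lua
-- Primary: single quotes.
local plain = 'a plain string'

-- Secondary: double quotes nested inside single quotes.
local nested = 'a "double quoted" word'

-- Third level: long brackets can hold both kinds without escaping.
local deeper = [[both 'single' and "double" quotes]]
```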
Return early
Return as soon as possible, especially if something fails. Usually programmers are told to stick with a single return point, but due to the language constructs in Lua this seems to create deeply nested code. It is thus better to return early.[33]
Example
Good:

if done then
    return done
end

Bad:

if not done then
    
end

return done
Truthy expressions first
Usually you are free to order the then and else clauses as you see fit, and you should order them so the if-clause has a positive test. This makes the code easier to read. Avoiding the else-clause altogether, when you can, is even more important.[34]
Example
Good:
if a == b then
    
else
    
end
Bad:
if a ~= b then
    
else
    
end
Interrogated first
Usually the interrogated value goes first in a logical expression, as this is more natural to read and thus less error prone. The inverted form is sometimes called the Yoda form. Sometimes (but rarely) the natural form does not work as expected, and the code terminates. This can be because Lua fails to figure out a proper tail recursive form of the expression, and then the stack overflows.[35]
Example
Good:
if length > 10 then
    
Bad:
if 10 < length then
    
Provide defaults
In languages with an or operator like the one in Lua, it is very easy to provide defaults. This is nice, as some common constructs fail badly if given an uninitialized value. This happens for example when a table is uninitialized and the code tries to index the nil value.[36]
Example
Good:

local t = arg or {}

t['foo']

Bad:
local t = arg

if t then
    t['foo']
else
    
end
Binary conditional
Lua does not have a binary conditional operator (??), or "nullish coalescing operator", but you can do much the same with an or operator. What goes on is, however, a little mysterious to newcomers to Lua. The logical operators and and or pass on the operand values themselves (materialized values) rather than plain booleans, and thus they can be part of other expressions than just logical ones. Note that or replaces false as well as nil.[37]
Example
Good:
local area = (length or 0) * (width or 0)
Bad:
local area = 0
if length and width then
    area = length * width
end
Ternary conditional
Lua does not have a ternary conditional operator (?:), but you can do the same with an and and an or operator. What goes on is, however, a little mysterious to newcomers to Lua. The logical operators and and or pass on the operand values themselves (materialized values) rather than plain booleans, and thus they can be part of other expressions than just logical ones.[38]
Example
Good:
local height = pressure and millibarToMeters(pressure) or 0
Bad:
local height = 0
if pressure then
    height = millibarToMeters(pressure)
end
Note that the form a and b or c where b is falsy will give c as outcome. That may not be the expected behavior.
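The pitfall in the note above can be made concrete with a small sketch. Both functions are hypothetical helpers written for the illustration.

```lua
-- The "a and b or c" idiom: falls through to c when b is falsy.
local function pick( condition, whenTrue, whenFalse )
    return condition and whenTrue or whenFalse
end

-- An explicit if returns whenTrue even when it is false or nil.
local function pickExplicit( condition, whenTrue, whenFalse )
    if condition then
        return whenTrue
    end
    return whenFalse
end

local surprising = pick( true, false, 'fallback' )       -- 'fallback'
local expected = pickExplicit( true, false, 'fallback' ) -- false
```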
Avoid repeat…until
The loop construct repeat…until delays the test until after the loop body has run. This is error prone and should be avoided. It is acceptable only when the test itself is costly compared to the executed block, or when avoiding the repeat-clause would require additional cleanup or finalization code.[39]
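A sketch of why the delayed test is error prone, using two hypothetical functions that count how many items they remove from a list. With an empty list the while version does nothing, but the repeat version still runs its body once.

```lua
-- Test first: an empty list is handled correctly.
local function drainWithWhile( queue )
    local removed = 0
    while #queue > 0 do
        table.remove( queue )
        removed = removed + 1
    end
    return removed
end

-- Test last: the body runs once even when the list is empty, so the
-- count is wrong for that case.
local function drainWithRepeat( queue )
    local removed = 0
    repeat
        table.remove( queue )
        removed = removed + 1
    until #queue == 0
    return removed
end
```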
Test arguments
There is a small utility library, libraryUtil, for testing argument types, and it should be used to avoid simple mistakes. Often the arguments are valid when other code is correct, but in fringe cases something creates a call outside the expected type range and the code fails in mysterious ways. Proper testing might catch an error early, and increase the probability of discovering the real root cause. Still, note that type testing is a test for failure, and not a test for correctness.
Example
Good:
local util = require 'libraryUtil'
local checkType = util.checkType

function concat( a, b )
    -- arguments: function name, argument position, value, expected type
    checkType( 'concat', 1, a, 'string' )
    checkType( 'concat', 2, b, 'number' )
    return a..'='..tostring(b)
end
Bad:
function concat( a, b )
    return a..'='..tostring(b)
end


Patterns

A few very common software patterns.

Command pattern
The software pattern most commonly missing from Lua modules is the command pattern. Some data is available, and some action should be chosen given this data. A typical smell that a command pattern is missing is long chains of if-then-elseif-end tests. The most common variant does not have named commands, but an accept-qualifier function. Both versions are very common when parsing arguments from the "invoke" parser function.
Example
With named commands:
local commands = {}
-- each member of the command table is a simplified "ConcreteCommand"
commands['foo'] = function( val ) return -math.sqrt(-val) end
commands['bar'] = function( val ) return 0 end
commands['baz'] = function( val ) return math.sqrt(val) end

function sqrt( command, value ) -- execute for the "Command"
    return commands[command]( value ) -- execute for the "ConcreteCommand"
end

sqrt( 'foo', -4 ) --> -2 (Should really be a complex number!)
With accept-qualifier functions:
local commands = {}
-- each member of the command table is a simplified "ConcreteCommand":
-- an accept function followed by an execute function
table.insert( commands, { function( val ) return val < 0 end, function( val ) return -math.sqrt(-val) end } )
table.insert( commands, { function( val ) return val == 0 end, function( val ) return 0 end } )
table.insert( commands, { function( val ) return val > 0 end, function( val ) return math.sqrt(val) end } )

function sumSqrt( value ) -- execute for the "Command"
    local accumulator = 0
    for _,command in ipairs( commands ) do
        if command[1]( value ) then -- accept for the "ConcreteCommand"
            accumulator = accumulator + command[2]( value ) -- execute for the "ConcreteCommand"
        end
    end
    return accumulator
end

sumSqrt( -4 ) --> -2 (Should really be a complex number!)

Testability

About testing and how to make the code testable. Note that the original book from the gang of four has software patterns for strongly typed languages, which do not fit a weakly typed language very well.[40] Whether Lua is strongly typed, or weakly, is up for debate.[41]

Complexity
There are usually several execution paths through a given piece of code, and this makes the code difficult to understand. The more code paths, the more difficult the code becomes. This can be measured as cyclomatic complexity, which estimates the number of paths by counting branch points. It gives a fair, but not very good, measure of code quality.[42]
Avoid globals
Functions with globals are hard to test, no matter how the globals are included in the code. To avoid this accidental trap, libraries can be included to vet the module, like require('Module:No globals'). Still, note that requiring some code is like importing a global, so be careful not to create a bigger problem.[43][44][45]
Example
Good:
local t = {}
function f( arg, opts )
    do_something( arg, opts )
end
f( 'somevalue', t )
Bad:
local t = {}
function f( arg )
    do_something( arg, t )
end
f( 'somevalue' )
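A guard against accidental globals can be sketched with a metatable that rejects undeclared names. This is only an illustration of the idea; it is not the implementation of Module:No globals, which may work differently, and the function name is hypothetical.

```lua
-- Hypothetical guard: reading or writing a name that is not already
-- present in the environment table raises an error.
local function forbidUndeclared( environment )
    return setmetatable( environment, {
        __index = function( _, name )
            error( 'read of undeclared global ' .. tostring( name ), 2 )
        end,
        __newindex = function( _, name )
            error( 'write to undeclared global ' .. tostring( name ), 2 )
        end,
    } )
end

-- Demonstrated on a private table rather than on _G, so running the
-- sketch does not change the behaviour of other code.
local environment = forbidUndeclared( { known = 1 } )
local readOk = pcall( function() return environment.unknown end )
local writeOk = pcall( function() environment.typo = 2 end )
```

Here both readOk and writeOk end up false, while environment.known can still be read as usual.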
Avoid changing scope
In Lua it is possible to change the scope of a function. Don't do that, as the function will then depend on "mini-globals", which is nearly as bad as globals. If everything works as it should, then the mini-globals are probably semi-constant, but there is no such guarantee.

Security

Lua runs in a kind of sandboxed environment. You might say that the Lua code executes in a DMZ outside the usual server, where only a few calls pass through the wall, and those should be properly validated by the internal code. In general there should be no reason to take any special precautions to make the code secure, as the result from the code execution will pass through proper cleaning and escaping before being included on the page.

See also

Programming environment

Some advice on how to set up your external programming environment.

Visual Studio Code

Use something like the following extensions.

Other tools

These are tools you might find useful.

References

  1. LuaUsers: LuaLint
  2. GitHub: mpeterv/luacheck
  3. The Art of Readable Code p. 21,34,43.
  4. The Art of Readable Code pp. 8-11, Choose Specific Words.
  5. The Art of Readable Code pp. 18-19, How Long Should a Name Be?
  6. The Art of Readable Code p. 12, Loop iterators.
  7. The Art of Readable Code pp. 15-17, Attaching Extra Information to a Name.
  8. The Art of Readable Code pp. 19-20, Acronyms and Abbreviations.
  9. The Art of Readable Code pp. 20-21, Use Name Formatting to Convey Meaning.
  10. LuaUsers:LuaStyleGuide
  11. Programming in Lua: 1.3 – Some Lexical Conventions
  12. LuaUsers:LuaStyleGuide
  13. Programming in Lua: 1.3 – Some Lexical Conventions
  14. PEP-8: Maximum Line Length (This style guide uses 79 chars.)
  15. PEP-8: Indentation
  16. PEP-8: Tabs or spaces
  17. PEP-8: Should a line break before or after a binary operator?
  18. PEP-8: Whitespace in Expressions and Statements
  19. The Art of Readable Code p. 56. The box has an alternate approach to what-why-how.
  20. The Art of Readable Code pp. 62-63, State the Intent of Your Code.
  21. Refactoring Guru: CodeSmells: Disposables: Comments
  22. The Art of Readable Code p. 49, Don’t Comment Bad Names—Fix the Names Instead.
  23. The Art of Readable Code p. 50, Comment the Flaws in Your Code.
  24. The Art of Readable Code p. 50, Comment the Flaws in Your Code.
  25. The Art of Readable Code p. 50, Comment the Flaws in Your Code.
  26. The Art of Readable Code p. 50, Comment the Flaws in Your Code.
  27. The Art of Readable Code p. 51, Comment on Your Constants.
  28. JavaScript Patterns. p. 2
  29. Programming in Lua: 6.1 – Closures
  30. LuaUsers: Minimising Closures
  31. PEP-8: String Quotes
  32. LuaUsers: StringsTutorial
  33. The Art of Readable Code pp. 75-79, Returning Early from a Function.
  34. The Art of Readable Code pp. 70-73, The Order of Arguments in Conditionals.
  35. The Art of Readable Code pp. 70-73, The Order of Arguments in Conditionals.
  36. Programming in Lua: 3.3 – Logical Operators
  37. LuaUsers: TernaryOperator (This is for ternary conditional, but the binary conditional is pretty similar.)
  38. LuaUsers: TernaryOperator
  39. The Art of Readable Code pp. 74-75, Avoid do/while Loops.
  40. JavaScript Patterns. p. 2
  41. Lua for Python Programmers: Types
  42. Quandary Peak Research: Measuring Software Maintainability
  43. C2Wiki: Global Variables Are Bad
  44. PlayControl Software: Using closures in Lua to avoid global variables for callbacks
  45. JavaScript Patterns, pp. 12-13

Reading