Multiple methods with confusably-similar names and effects
A class or namespace has two functions with similar names. A programmer who wants the functionality of the first function may mistakenly call the second. If the effects are similar, casual testing may mislead the programmer into thinking they had called the correct function, even if the two functions differ in some important way.
The classic example of this is the confusion between Thread.start() and Thread.run() in Java. Python's standard thread package was based on Java's, and also has this problem. For example, consider these snippets:
Thread myThread = new Thread(() => doSomethingExpensive());myThread.run();
class MyThread(thread):def run(self):doSomethingExpensive()myThread = MyThread()myThread.run()
In both of these, the programmer intended to call myThread.start(), which creates a new background thread and then runs doSomethingExpensive on that background thread. Instead, they have called myThread.run(), which runs doSomethingExpensive() in the current thread. This happened because start and run are confusingly similar names. Further, the application will appear to work, but it will be slower because something which should be done in the background is blocking important behavior. Because of that, this bug can go undetected in a codebase for a long time.
For another example, consider the battle between the various load functions in PyYAML. Originally, in versions 3.12 and below, PyYAML had two functions called load and safe_load, where safe_load had a safe behavior while load could execute arbitrary code. In PyYAML 4.1, they renamed the old load to danger_load. They removed the old safe_load function and created a new load function, which is also unsafe. This caused significant controversy. The story is chronicled in this blog post.
Both pairs of functions raised issues of confusability. Users of a function called load will see YAML parse correctly and believe they have implemented this functionality correctly, but they have in fact introduced an arbitrary-code-execution vulnerability into their software. And when there is a function called danger_load, a programmer may be tempted to think that the load function implements the safer options, but in this library load was in some ways more dangerous than danger_load.
Discussion and Lessons
Having confusable methods is one way to violate the Representable/Valid Principle, that there should be a 1-1 mapping between representable and valid states of the program. For the thread example, the state where doSomethingExpensive() has been run in the main thread is an error state which should never occur. It should not be possible to call the Thread.run() method. In PyYAML 3.12, calling load enters a state where arbitrary code may have been executed, which is invalid; it should therefore not be possible for a programmer to inadvertently call it. In comparison, after the changes of version 4.1, an ordinary programmer would only call load and never danger_load, meaning the result of having called danger_load is not representable in any program surviving minimal code review. However, this only makes uses of the load function more likely to pass code review, even though it is also insecure.