The Software Crisis
In October 1968, the NATO Science Committee held a conference in Garmisch, Germany. This was the first international conference on software engineering (a then-controversial term). Some 52 experts from around the globe were brought together for four days to discuss a growing and uncomfortable problem: software projects never seemed to go according to plan.
In 1972, Edsger W. Dijkstra (who had attended the conference) described this as a "Software Crisis". He outlined how the explosive growth of computing power and novel languages like ALGOL and FORTRAN created opportunities the newly forged programming community was ill-equipped to exploit. Despite best efforts:
- projects ran over-budget,
- projects ran over-time,
- projects were of poor quality,
- projects failed to meet requirements,
- and in some cases, projects were never delivered.
As of writing, more than 50 years later, these issues are still just as relevant and familiar, though perhaps today we would term this a "Software Normality". While there have been improvements in all aspects of The Software Crisis, the problem is far from solved, and may never be.
Here are some recent projects that demonstrate The Software Crisis in action:
- SIREN (Surrey Integrated Reporting Enterprise Network). An administration system for Surrey Police in the United Kingdom which did not meet requirements and was never completed. It cost £14.8 million and was abandoned in 2013. (https://www.bbc.com/news/uk-27911416)
- The Queensland Health Payroll System. A project funded by the Queensland state government in Australia that came in at over 200 times the expected budget of $6 million (AUD), eventually costing $1.6 billion by 2013. It was only made fit for purpose after failing to pay workers correctly (or at all) for over a month. (https://www.smh.com.au/technology/queensland-health-payroll-fail-government-ordered-to-pay-ibm-costs-20160404-gnxpqj.html)
- Expeditionary Combat Support System. A United States Air Force project that was abandoned after the entire budget of $1.1 billion and 10 years of development was wasted. According to one spokesperson, "[They] estimate it would require an additional $1.1 billion for about a quarter of the original scope to [be achieved]..." (https://www.computerworld.com/article/2493041/air-force-scraps-massive-erp-project-after-racking-up--1b-in-costs.html)
Why Are Projects Over-Budget?
Projects run over-budget because quality and deadlines are easier to achieve when given more money: staunchly enforcing a budget can instead produce a product that is not fit for use, or one not delivered before a critical deadline, wasting the capital entirely. Since developer time is usually the largest cost in development, understanding why projects are incorrect and/or over-time is key to understanding why they are over-budget.
Why Are Projects Over-Time?
When a developer estimates the time to complete a feature, it is common practice to double or triple that estimate. This pessimistic and pride-offending practice proves necessary time and time again: developers consistently run into problems that only reveal themselves during development and are not obvious at the outset. Logic would suggest that developers factor this phenomenon into their estimates as they mature, but this generally doesn't happen.
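As a minimal sketch of the padding practice described above (the multipliers are illustrative assumptions, not an established standard):

```python
def padded_estimate(gut_feel_days: float, multiplier: float = 2.5) -> float:
    """Scale a developer's first instinct to allow for the problems that
    only reveal themselves once development is underway."""
    return gut_feel_days * multiplier

# A task that "feels like" 4 days gets quoted as 8-12 days.
print(padded_estimate(4, multiplier=2.0))  # 8.0
print(padded_estimate(4, multiplier=3.0))  # 12.0
```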
An experiment run in 2011 by Chance et al. may hold the answer to why even experienced programmers quote poorly. In the experiment:
- Participants were given a set of difficult questions with answers provided.
- They were instructed to complete the questions themselves.
- They could check their own work and correct mistakes if they wanted to.
Unsurprisingly, the participants received high marks.
The same participants were then given a new test without the answers and asked to predict how well they would do.
Their predictions were appallingly optimistic compared to the control group. They focused on the good memories of doing well, and failed to heed the bad memories of correcting their own answers. To put it another way, they ignored the hard evidence of their poor performance, preserved in their own marking and corrections, in favour of believing in their own intelligence. The self-deception is powerful, immediate and, further tests concluded, long lasting.
Programmers rating their own future performance are under the same influences. The successes of the past cloud objectivity when assessing complexity and ability. As more time passes and a developer becomes more experienced and validated, their ego becomes more of an obstacle to accurate estimation. In addition, an experienced developer is more likely to be involved in making decisions around planning and budgeting.
When a three-year-old child is given a box of Smarties, opens it and finds it contains pencils, they're naturally disappointed. But fascinatingly, if they're asked to imagine what some other child might think is in the box, they will say "pencils". Even when asked what they themselves thought was in the box before it was opened, they will still say "pencils". Children that young assume that what they know now, they always knew, and that everyone else already knows it too. ("Theory of mind - Smarties task and Sally-Anne Task")
As adults, experienced developers have a fuller "theory of mind", but we still tend to use our own mental processes as the template for predicting others' behaviour, and even neglect to consider the lack of knowledge in others for sufficiently complex tasks (Keysar et al., 2003). An experienced developer may lean towards assuming that a less experienced developer would build something the way they would, in the time they would, or making only as many mistakes as they would. This is disastrous when allocating time and budget.
Worse still, developers paradoxically overestimate their own abilities whilst assuming less experienced developers have abilities similar to their own. One study of software developers surveyed two software companies and found that 32% of developers at one, and 42% at the other, rated themselves as being in the top 5% of developers at their company (https://www.youtube.com/watch?v=pOLmD_WVY-E).
Why Do Projects Fail To Meet Requirements?
Projects fail to meet requirements due to errors in two key areas:
- Requirement errors (miscommunication, lack of foresight, changing requirements)
- Implementation errors.
Software development is hard. Software made for businesses must suit their needs and solve problems in unique ways, but developers start with little knowledge about the business.
Gathering Business Knowledge Is Hard
There are two sides to this issue - customers and software engineers. Customers may not know exactly what they need, though they know what they want. This can be attributed to a few causes:
- no one person in the business has a clear picture of the entire scope of the problem
- people in different areas of the business have different priorities and opinions
- people with useful experience won't think to disclose business knowledge they take as a matter of course - this relates back to theory of mind, as already discussed
- in general it is difficult to interrogate a mental picture (or documented design) of complex software to understand what key requirements are missing, confusing, or contradictory, even though after implementation these mistakes may be obvious
- customers sometimes insist on a design which is fundamentally flawed, particularly in the development of a new tech-core business where high-level blue-sky thinking conflicts with real-world implementation
On the other side, software engineers can also be responsible for "fudging requirements":
- software by its nature typically involves distilling a requirement into an unambiguous form that applies to all situations, to produce a predictable result. Engineers can be tempted to over-simplify requirements to make the technical design better or easier (a short sketch after this list illustrates the trap).
- engineers and project managers mentally draw similarities to other projects they have worked on to reduce the sense of risk and uncertainty.
- for similar reasons, engineers and managers may favour libraries, technologies, architectures and methodologies that they have used in the past, even if better options are available for the project.
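To make the first point concrete, here is a hypothetical sketch (the "greet the user by name" requirement and the User type are invented for illustration). The distilled, unambiguous version below quietly assumes every person has exactly a first and a last name:

```python
from dataclasses import dataclass

@dataclass
class User:
    first_name: str
    last_name: str

def greeting(user: User) -> str:
    # Predictable and easy to implement, but wrong for mononyms, other name
    # orders, honorifics, and users who never supplied a last name.
    return f"Welcome, {user.first_name} {user.last_name}!"

print(greeting(User("Ada", "Lovelace")))  # fine
print(greeting(User("Teller", "")))       # "Welcome, Teller !" - requirement fudged
```

The simplification produces cleaner code, but the requirement it satisfies is no longer quite the one the customer asked for.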
Returning to the last point, the personal preferences of engineers can persist to the point of irrationality. A famous example is the "GOTO" statement, which, while identified as "harmful" as far back as 1968 (Dijkstra), was in such widespread use that it did not really start to fall out of favour until the early 90s, culminating in the assigned form of GOTO being deleted from the Fortran standard in 1995.
Programming Is Hard
Aside from requirements gathering, raw programming ability matters, yet many companies hire developers who are self-taught or hold only basic certificates.
Difficulties in creating software also go way beyond programming ability. Medium to large projects also require discipline and structure to ensure:
- adherence to the architectural design
- maintenance and reduction of technical debt
- anticipation of future requirements
- attention to non-functional requirements
- completeness of functional requirements
- adherence to budget/time
These are all areas that even experienced and certified programmers can find difficult to uphold. For a project to be successful, programmers have to simultaneously achieve all the above objectives, which are often in conflict. Some examples include:
- functional vs non-functional requirements. Functionality aims to make an action easier to perform, but security (a non-functional requirement) deliberately makes actions more difficult.
- technical debt vs budget. Technical debt makes development tasks take longer by complicating the implementation, which in turn adds further technical debt. However, redesigning to remove technical debt takes a lot of time without delivering any new functional requirements.
Then, of course there are also just normal human mistakes.
"Programmers call their errors “bugs” to preserve their sanity; that number of “mistakes” would not be psychologically acceptable." - M. E. Hopkins, researcher at IBM, 1969
The culture and support around developers is also critical, and optimal developer culture is a balancing act. Without a tool to track known issues, issues will be forgotten; but a complex or slow tool that no one wants to use produces much the same effect. Without a certain amount of accountability, cost to reputation or personal investment in a project, programmers won't be diligent in checking for mistakes; too much accountability leads to developer paralysis. Accurately determining who caused a problem is also a time-consuming and potentially demoralising exercise, yet without it fewer lessons can be learned from failure and the risk of repeated mistakes goes unmanaged.
Developer culture extends into the code itself. Creating consistent, well-designed, unit-testable, accurate and stable code takes time, and time is often short. In teams where senior developers oversee junior developers, the skill gap between programmers can make achieving every objective simultaneously to the utmost standard even more untenable. An acceptable code standard must be agreed upon, and must take into account the relative skills of team members as well as the budget.
Project Management Is Hard
Can you stop a tsunami if you see it coming? To manage a project, a manager must be aware of its state and correct course as needed. Project management tools excel at seeing problems coming, but offer only limited influence to prevent them. At the heart of it, if we can identify why projects fail, we can give project managers better tools to handle problems, and start to address The Software Crisis.
Currently, Project Managers have these limited controls over projects:
- delegation - set who is assigned to a task to cover a weakness or exploit an identified strength in the team. Managers can re-delegate as weaknesses or strengths become apparent, or become more or less important.
- motivate - a team which isn't motivated will work more slowly or less diligently, but de-motivation can also be a symptom of a greater problem that must be tackled first. Without solving the core issues, "re-motivating" is merely cracking the whip augmented with socially conditioned brainwashing.
- structure - structure helps coordinate a team to produce optimal momentum. A good structure reduces confusion and keeps everyone on track and accountable without excessive management of the structure itself or demotivating the team - this is the developer culture balancing act.
- re-interpret/redesign - when requirements are vague enough, they can be redesigned to reduce complexity, usually at some cost to other aspects - commonly by sacrificing aesthetics or by simplifying the implementation of functional requirements.
Each of these levers may make more effective use of time, but none of them provides more time. Even redesigning scope just means the project manager is prioritising the must-haves over the nice-to-haves by removing optional objectives that are no longer achievable.
Project Managers can sometimes control the below, but often they are not privileged to do so:
- resource - increasing the budget to improve throughput, or rearranging budgets between areas (for instance, trading testing budget for development budget is a common adjustment that incurs its own problems).
- re-scope - slightly different to re-designing: changing the actual scope of the project to reduce work is often impossible due to contractual constraints, though project managers with good stakeholder relationships or more flexible contracts may manage it.
Project Managers cannot stop a tsunami. Example tsunamis include:
- a budget that is completely off due to systematic overheads that can't be removed, or bad assumptions that produced an incorrect quote
- key requirements/assumptions no longer hold - for example, the project depends on a third party which no longer exists or has changed substantially, or the requirements turn out to be badly incorrect or incomplete
- key people in the team leave with no sufficient replacements who can take over their responsibilities
Testing Is Hard
People outside the software engineering field commonly believe that testers verify software is free of bugs.
Testers themselves commonly believe their role is to verify the minimum acceptance criteria for the software.
This leads to a disconnect in the expectations placed on testers. Testers want clear acceptance criteria from customers. Customers want bug-free code - what could be clearer? But testing for bug-free code is an impossible exercise. As far back as 1936, Turing proved it is impossible to create an algorithm that can determine, for every program, whether a mistake will send it into an infinite loop (the halting problem). We still don't know whether an algorithm that merely detects long-running loops can exist without itself being too slow to be useful. And this is only one category of bug that could be present in a system. Many more errors are detectable only under certain conditions, or at certain times, and identifying all of them might require testing the system in literally every conceivable combination of possible actions and states. This is impractical on simple systems, and impossible on large stateful systems.
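Turing's argument can be sketched in a few lines. This is only an outline of the contradiction: halts() and contrarian() are hypothetical, and the whole point is that no such halts() can actually be written.

```python
def halts(program, data) -> bool:
    """Hypothetical oracle: True if program(data) would eventually stop."""
    raise NotImplementedError  # stands in for the impossible general checker

def contrarian(program):
    # Do the opposite of whatever the oracle predicts about a program fed itself.
    if halts(program, program):
        while True:   # predicted to halt, so loop forever
            pass
    return            # predicted to loop forever, so halt immediately

# Asking halts(contrarian, contrarian) has no consistent answer: whichever
# value it returned, contrarian would do the opposite. So no general detector
# for "this program loops forever" can exist, let alone for subtler mistakes.
```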
Testing is also under-appreciated in the cost of development: developers are often given time to correct unintended consequences, but testers are then pushed to pass code quickly to make up for that lost time. This naturally leads to more undetected issues.
Issues Are Deferred
Finally, issues tend to be encountered only as the project is realised. Flaws may be introduced very early on, but it may not become apparent that there is a problem until much later in the development cycle. If the flaw is a requirements miscommunication, it may not be encountered until delivery. This limits the ability to recover and adjust, and may have a widespread impact on the rest of the body of work.
Addressing the Crisis
If we want to fix The Software Crisis, we need to tackle the many multifaceted and complex issues at its core. If the Crisis is to be solved, we need not one, but several magic bullets:
- an accurate heuristic to rate task complexity that takes the strengths and weaknesses of the programmer into account in a non-subjective manner
- a methodology to reliably and completely interrogate requirements in any situation
- project management tools to identify and rectify all operational issues in project development, with access to all rectifying measures
- designers and engineers trained to make rational design decisions with complete knowledge of tools available in the domain and the skills of the team
- an early detection system to identify any mistakes against the requirements
- human beings who can apply above tools without fault
- a methodology to gain complete business knowledge and analysis of needs to inform the design
We have in fact made substantial progress since the 60s in all of these areas, though we haven't truly "solved" any of them. We now have software methodologies that offer various improvements to help manage projects, engineering education generally includes requirements-gathering training, and automated unit tests identify programmer mistakes, at least some of the time.
Computers themselves have been essential in alleviating The Software Crisis so far. We already rely on compilers to find syntax errors, on automated coding assistants to suggest better alternatives, and on tools to schedule and track tasks/tickets, merge code from many sources, deploy, and so on. It is not so unexpected, then, that the recent explosion in AI capabilities offers further potential solutions. AI-leveraged tools could:
- rate complexity of tasks based on description
- monitor developers to evaluate competency
- identify requirement errors
- rationally design system components with complete knowledge of all available tools
- monitor development throughout for issues
- help developers with unfamiliar tasks or simple tasks
On that last point though - AI-assisted tools may still be stubbornly refused until the next generation of programmers comes in, much as programmers of old refused to give up their precious GOTO statements despite clear disadvantages!
In the absence of AI tools that are ready today, all I can suggest for avoiding the worst of the crisis is to consider the above dangers in all your planning and management, and to budget accordingly. There is a good reason why experienced and talented Project Managers, developers and testers are worth so much: they help avoid the myriad issues above. It would be nice for this article to solve all of those problems, but more than 50 years on, our entire industry has only managed to stop drowning and start treading water; we are yet to out-swim the tsunami.
References:
- P. Naur and B. Randell, "SOFTWARE ENGINEERING - Report on a conference sponsored by the NATO SCIENCE COMMITTEE - Garmisch, Germany, 7th to 11th October 1968," NATO Science Committee, Scientific Affairs Division, Brussels, Belgium, 1969.
- E. W. Dijkstra, "Go to statement considered harmful," Commun. ACM, vol. 11, no. 3, pp. 147-148, March 1968.
- E. W. Dijkstra. (1972). The Humble Programmer [Online]. Available: https://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD340.html
- Z. Chance, M. I. Norton, F. Gino and D. Ariely, "Temporal view of the costs and benefits of self-deception," PNAS, vol. 108, suppl. 3, pp. 15655-15659, September 2011.
- "Theory of mind - Smarties task and Sally-Anne Task" [Online]. Available: https://youtu.be/41jSdOQQpv0
- B. Keysar, S. Lin and D. J. Barr, "Limits on theory of mind use in adults," Cognition, vol. 89, issue 1, pp. 25-31, August 2003.
- J. N. Buxton and B. Randell, "SOFTWARE ENGINEERING TECHNIQUES - Report on a conference sponsored by the NATO SCIENCE COMMITTEE - Rome, Italy, 27th to 31st October 1969," NATO Science Committee, Scientific Affairs Division, Brussels, Belgium, 1970.
cover image: http://homepages.cs.ncl.ac.uk/brian.randell/NATO/N1968/GROUP7.html