There’s been a lot of attention on GitHub Copilot in recent months. It generates code for you based on the context of the surrounding code in your editor, filling in the blanks with what its trained machine learning model ‘thinks’ you are attempting to code.
While impressive, it’s questionable from a legal perspective whether you can or should build a system for your company or a client on code generated from code that other developers and companies shared online, unaware that it would later be used to generate new code derived from theirs. Code generated by a model trained on the work of developers who have no contractual relationship with your company or client raises legal questions that I don’t think our industry has properly addressed, such as:
- who owns the generated code?
- who is responsible for the generated code when it fails?
- who is responsible for fixing bugs in the generated code? The obvious answer is you and your company if you decide to use that generated code, but what if those bugs cause losses to your business (from outages or other functional issues) or, worse, loss of life (a risk with any safety-critical code)?
There’s currently ongoing legal action against Microsoft, claiming that Microsoft used developers’ code from their GitHub projects without permission.
Writing code is not the hard part
AI code generation does not address any of the most significant problems in software development. Code generation helps to write lines of code, but typing lines of code at your keyboard is not where the majority of time and effort is spent in the overall process of building a software product or system. Generating code via any approach, whether from models (e.g. UML class diagrams) or from trained machine learning models, solves the smallest and probably simplest part of the software development process. Far more time and effort is spent before developers actually start to write any code. By the time you start writing code you’ve already reached the easy part; the majority of the hard work has been done. Areas that are considerably harder and/or where more time and effort is spent include:
- understanding the customer’s problem (understanding the requirements)
- designing an effective solution to solve the customer’s problem, given a number of constraints (e.g. time, budget and quality)
- finding acceptable compromises between the competing priorities and needs of different areas of the business
What’s Next?
It can’t be denied that any tool that reduces the time and effort needed to produce appropriate and effective solutions is worthwhile, and we’ll obviously see improvements in the coming years in current AI models and their ability to help with code generation.
Whether the technology will improve to the point where it can be fed a collection of vague requirements and generate a working system remains to be seen.