[SystemSafety] Putting Agile into a longer perspective

Steve Tockey steve.tockey at construx.com
Thu Oct 24 22:09:04 CEST 2019


Olwen,

"I do occasionally do professional translations (FR-EN and DE-EN). Translation is the most difficult thing in the humanities."

Having studied French in school, and then self-studied both Korean and Chinese (none of which I would remotely consider myself fluent in, BTW), I would suggest that one part of this translation problem is the built-in ambiguity and verbosity that I just mentioned in my last reply. Another problem is that words in one language have definitions that don't overlap with nearly-equivalent words in the other language. For example, the Chinese word 家 (pronounced "jia" in Mandarin) can mean house, family, or a combination of the two (among other things). So when someone tries to translate 家 into English, sometimes house is the right translation, sometimes family is the right translation, and sometimes there simply isn't an equivalent English term to get across the original intent. This is fairly easy to illustrate with translate.google.com. Translate a sentence in one language to another, then translate the translation back to the source language. It can be entertaining how much really does get "lost in translation" because of the non-equivalence of words in the two languages.


"Therefore, I have long thought that software engineering would be a lot easier, and therefore cheaper, if we used but one formalism for specification, design and implementation."

Depending on what you mean by "specification", we may disagree. If by "specification" you mean something like "functional requirements" then I completely disagree. The purpose of non-trivial software is always to automate enforcement of some set of policies and execution of some set of processes that are distinct and separable from the technologies that one might use in automating those policies and processes. I have said before that "code is a mapping" (not my original idea, BTW, credit Sally Shlaer and Steve Mellor with that one). The lines of code are mapping policy and process semantics onto automation technology semantics, subject to the constraints that the mapping 1) be sufficiently complete, 2) preserve the original policy and process semantics, and 3) satisfy the non-functional requirements. Therefore, the formalisms necessary to precisely and concisely specify policy and process semantics must be different formalisms than the ones necessary to precisely and concisely specify mappings.
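To make "code is a mapping" concrete, here is a rough Python sketch. The policy, the names, and the scaling factor are all hypothetical, chosen only for illustration: the same external policy semantic (a temperature reading is only meaningful between -40 and 85 degrees Celsius) is mapped onto two different technology representations, and both mappings are complete and semantics-preserving even though the code differs because the technologies differ.

# A rough sketch. Hypothetical policy: a temperature reading is only
# valid between -40 and 85 degrees Celsius (policy/process semantics).
POLICY_MIN_C = -40.0
POLICY_MAX_C = 85.0

def policy_valid(celsius):
    # The external meaning: readings outside the range must be rejected.
    return POLICY_MIN_C <= celsius <= POLICY_MAX_C

def store_as_float(celsius):
    # Mapping 1: floating-point representation on a platform with an FPU.
    if not policy_valid(celsius):
        raise ValueError("policy violation: reading out of range")
    return celsius

def store_as_centidegrees(celsius):
    # Mapping 2: scaled-integer (centi-degree) representation for a
    # fixed-point platform. Different technology, same policy semantics.
    if not policy_valid(celsius):
        raise ValueError("policy violation: reading out of range")
    return round(celsius * 100)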

Using your example:

foo1 : gpio sequence of integer;

foo2 : network sequence of packet;

foo3 : file sequence of record;

note that there are two very distinct things going on. "gpio sequence of integer" is an automation technology semantic. So are "network sequence of packet" and "file sequence of record". Said another way, they specify explicit representation choices that some programmer has made. Unfortunately invisible at this level, though, are the intended external meanings of "foo1", "foo2", and "foo3". What do the integers in the gpio sequence mean to anyone outside of this code? Are they counts of some relevant phenomenon? Are the different integer values representations of some enumeration (0 means stop, 1 means go, 2 means pause, etc.) chosen because the gpio service doesn't support the concept of a type-safe enumeration? Could the line have instead been:

foo1 : gpio sequence of char;

where 's' means stop, 'g' means go, and 'p' means pause? If it's a sequence of integer, does the value -730509 have any valid external meaning? If so, what is it? If it's a sequence of char, does the value '%' have any valid external meaning? If so, what is it?

What are the intended external meanings behind foo1, foo2, and foo3? They are clearly different representation mechanisms, but are they intended to represent different external concepts? Or are they all intended to be merely different, technologically-convenient-at-the-time representations of exactly the same external concept?
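As a hedged illustration of the type-safe enumeration point (the names Command, encode, and decode below are mine, not from any real gpio library), a small Python sketch shows how making the external meaning explicit leaves no room for values like -730509 or '%' to sneak in unnoticed:

from enum import Enum

class Command(Enum):
    # External policy vocabulary: what the values mean outside the code.
    STOP = 0
    GO = 1
    PAUSE = 2

def encode(cmd):
    # Technology mapping: the gpio service only speaks integers.
    return cmd.value

def decode(raw):
    # Raises ValueError for raw values with no external meaning, e.g. -730509.
    return Command(raw)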

So I counter-propose that the reason software is so expensive is that:

1) We (the industry as a whole) refuse to accept the importance of precisely specifying and validating external policy and process semantics. Any wishy-washy-ness and/or incompleteness in one programmer's understanding of policy and process semantics leads to equally wishy-washy code. When policy and process semantics are incompletely defined, the programmer just makes something up--with a clear non-zero probability that what they make up is wrong. When two or more programmers are involved, the problem is only amplified because programmer #1's intended meaning of something like foo1 can easily be different from programmer #2's interpretation of what they thought programmer #1 must have meant. Ha! It's the same as the non-equivalence of natural language words, above. In programmer #1's vocabulary foo1 means this, but in programmer #2's vocabulary foo1 means that. And while this & that may be close, they turn out to be different enough to cause problems. "You call it a bug. I call it a defect. It is merely a semantic inconsistency".

2) Along with 1), we (the industry as a whole) refuse to accept the importance of separating the what-does-it-mean external policy and process semantics from the how-should-I-represent-it-in-technology semantics. This leads directly to a combinatorial explosion in complexity. "I'm not only unsure this is the best representation of external concept X, I'm not even clear on what external concept X is supposed to mean to the users". There's a recipe for disaster, no?

Therefore, if policy and process semantics were precisely and concisely specified and validated (using formalism set A) then it becomes clear and obvious to every single participant exactly what policy and process semantics are supposed to be automated in the first place. We can now focus exclusively on efficient and effective mappings (using formalism set B) of the given policy and process semantics onto the technology semantics. What used to be one big, hairy problem becomes two much simpler ones. Problem-space complexity is dealt with in formalism set A, solution-space complexity is dealt with in formalism set B. That is guaranteed to be a whole lot simpler than trying to manage the problem-space complexities and the solution-space complexities at the same time.
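To sketch what that separation could look like (continuing the hypothetical stop/go/pause example from above, with an invented gpio integer encoding), formalism set A would record only which commands exist and which process transitions are allowed, while formalism set B would record only how those commands are mapped onto one particular technology. A rough Python sketch, purely illustrative:

# Formalism set A (sketch): policy/process semantics only -- which
# commands exist and which transitions the process allows.
POLICY_TRANSITIONS = {
    ("stopped", "go"): "running",
    ("running", "pause"): "paused",
    ("paused", "go"): "running",
    ("running", "stop"): "stopped",
    ("paused", "stop"): "stopped",
}

# Formalism set B (sketch): the mapping onto one technology choice, a
# gpio-like channel that carries commands as small integers.
GPIO_ENCODING = {0: "stop", 1: "go", 2: "pause"}

def drive(state, raw):
    # Decode the technology value first, then apply the policy.
    command = GPIO_ENCODING.get(raw)
    if command is None:
        raise ValueError("raw value %d has no external meaning" % raw)
    next_state = POLICY_TRANSITIONS.get((state, command))
    if next_state is None:
        raise ValueError("policy forbids '%s' while %s" % (command, state))
    return next_state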


Cheers,

-- steve



From: Olwen Morgan <olwen at phaedsys.com>
Date: Tuesday, October 22, 2019 at 4:50 AM
To: Steve Tockey <Steve.Tockey at construx.com>, "systemsafety at lists.techfak.uni-bielefeld.de" <systemsafety at lists.techfak.uni-bielefeld.de>
Subject: Re: [SystemSafety] Putting Agile into a longer perspective



Steve Tockey wrote:

Unfortunately, building just the first one is really expensive. Who ever really throws it away and builds it again?


Agreed - but why is it expensive? I said I had some ideas on this, so here goes:

One thing that has often struck me about software development is how many different formalisms we use within it. And when we use different formalisms, we end up translating between them. As it happens, I do occasionally do professional translations (FR-EN and DE-EN). Translation is the most difficult thing in the humanities. Mathematics is the most difficult thing in the sciences. In software engineering, if we want to get the translation right, we have to accomplish the most difficult thing in the humanities using the most difficult thing in the sciences. Small wonder that it causes so much brain-ache.

Therefore, I have long thought that software engineering would be a lot easier, and therefore cheaper, if we used but one formalism for specification, design and implementation. It would probably need two forms, one graphical and one textual, but I cannot see any essential impediment to this. Obviously the textual form would be compilable (but more importantly also analysable), so we are thinking about something with the properties of a very high-level programming language - of a level comparable with that of, say, Z alloyed with CCS. With modern languages, I can't see this to be much of a problem, since once you have sets of tuples (properly defined - not à la SQL), you have all the expressive power you need in that respect. Also, proof annotations should be an integral part of such a language.

Most importantly, however, such a language should support a concept that I call incremental binding, whereby not all the attributes of processes and objects need be declared all at once. In this way, following, say, a method such as SSADM (obviously heavily cut down - its technical ideas are fine but the overblown documentation system is a pain) we could produce specification and design documents that have exact, analysable textual equivalents at all stages of development. When all the detail is there, the textual artefacts would be executable.

To do this, you have to abandon traditional PL design, and sadly also the concept of refinement. I'll give an example:

Consider the system context diagram. It names the system and its inputs and outputs. The textual form of this could be something like:

example: system fubar : inputs ( foo1, foo2), outputs (foo3) ;

Later on in the development more detail could be provided by binding the input and output names to particular file types. e.g.:

foo1 : gpio;

foo2 : network;

foo3 : file;

This would establish that foo1 comes from a gpio interface, foo2 from a network interface, and foo3 is a file. Then later (temporally but not spatially in the code) one could write:

foo1 : gpio sequence of integer;

foo2 : network sequence of packet;

foo3 : file sequence of record;

(If you're beginning to think that here I have in mind a data-logging application, you're right.)

In this way, the attributes of program objects would be defined incrementally by successive addition of attribute detail - but not by proof-requiring decomposition into substructures. During development, balancing checks would determine where detail is missing and/or inconsistent across parts of the specification/design. This also implies that the analysis tools can determine, from the textual form of the evolving spec/design/program, which kinds of analyses can and cannot be performed on it. My idea is that you perform analyses every time you change the artefact and revert to the previous version if something is wrong (continuous integration devops would help here). The language would require that relevant proof annotations be present at every stage of development - which would support early detection of errors by the soundest available means.
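As a very rough sketch of how incremental binding plus balancing checks might look in executable terms (the Python class and attribute names here are purely illustrative, not a proposal for the actual language):

# Each named object accumulates attribute detail across successive
# development steps; a balancing check reports what is still missing and
# which analyses the current level of detail supports.
class Binding:
    def __init__(self, name):
        self.name = name
        self.attrs = {}

    def bind(self, **detail):
        self.attrs.update(detail)   # later steps add detail, never remove it
        return self

def balancing_check(binding):
    missing = [a for a in ("interface", "element_type") if a not in binding.attrs]
    analyses = ["context-diagram consistency"]
    if "interface" in binding.attrs:
        analyses.append("interface allocation")
    if not missing:
        analyses.append("data-flow typing")
    return missing, analyses

# Successive bindings, mirroring foo1 above.
foo1 = Binding("foo1")
foo1.bind(interface="gpio")                    # early in development
foo1.bind(element_type="sequence of integer")  # later, more detail
print(balancing_check(foo1))                   # -> ([], [... all three analyses])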

If you do programming in this way, you can have a textual form for every graphical (or tabular) specification and design artefact that is analysable as soon as it is created. OK, only when you've got all, or at least most of, the detail do you have anything you can execute - but on the way to the executable artefact, you have stayed entirely within a single formalism - serving specification, design and coding - and you use a coordinated set of analytical tools from one end of the process to the other.

Such an end-to-end language, supported by a consistent set of tools, would, I believe, reduce process costs by an order of magnitude (big claim but no evidence - but that's how all progress begins).

Want to start throwing stones at the idea? ... Feel free ... especially if you're Derek Jones ... :-))


Olwen


PS: This idea is not by any means original. I can trace it back at least as far as Kit Grindley's Systematics: A New Approach to Systems Analysis, Petrocelli Books, 1978, ISBN-10: 0894330209, ISBN-13: 978-0894330209 - IMO a seminal book decades ahead of its time. If he'd written it about 15 years later, by which time formal methods had a lot more traction, its main weakness - lack of formal semantics - might not have got it so unjustly ignored.




