[SystemSafety] AI and the virtuous test Oracle - action now!

Derek M Jones derek at knosof.co.uk
Sun Jul 23 13:38:02 CEST 2023


Les,

> Anthropic is also working on Mechanistic Interpretability - what is this?
> Dario Amodei’s words: “It's the science of figuring out what is going on inside
> the models. Explaining why and how it came up with the solutions it is
> providing. Important when it's creating output that we didn't expect. Like a
> brain scan to find out what's going on inside.”

Don't they have the source code to look at?

Plus, there are some good online tutorials that explain
how these models work:
https://huggingface.co/learn/nlp-course/chapter1/1?fw=pt

Some of the nuts and bolts:
https://huggingface.co/docs/transformers/tokenizer_summary
https://huggingface.co/blog/encoder-decoder
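
For anyone who wants to see the nuts and bolts running, here is a
minimal sketch of the subword tokenization step those pages describe
(my own example; it assumes the Hugging Face transformers package and
the bert-base-uncased checkpoint, neither of which the tutorials are
tied to):

    from transformers import AutoTokenizer

    # Fetch the vocabulary of a small, widely used model.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    text = "Mechanistic interpretability is like a brain scan."
    tokens = tokenizer.tokenize(text)              # subword pieces
    ids = tokenizer.convert_tokens_to_ids(tokens)  # integers the model sees
    print(tokens)  # rarer words may split into several '##'-prefixed pieces
    print(ids)

The model only ever sees the integer ids; the tokenizer summary above
discusses why subword vocabularies are preferred over whole words or
raw characters.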

> Given that an AI is likely to change its behaviour as a function of the data it
> senses in its environment, there is a need for continuous validation. This can
> only be achieved with a monitoring AI that is a permanent feature of a system
> for its operational life. An adult in the room if you like.

What other industry touts the extreme dangers of its own products?
These large AI companies are desperate for regulation because it
would create a moat around their business model.  If everybody can
build and run models, it is not possible to charge premium rates
for regulated access.


-- 
Derek M. Jones           Evidence-based software engineering
blog: https://shape-of-code.com

