If an AI assistant ever sighs before answering our question, do we need to worry? That kind of question blurs the line between fiction and software, which is why one AI lab has launched a new research program on "model welfare" to examine whether advanced artificial intelligence might one day deserve moral consideration. It seems we are officially in the "Are we hurting our chatbots?" era.

Anthropic, the company behind the Claude family of models, has initiated a new research program to explore the "welfare" of AI systems. It is a daring and somewhat controversial step onto uncertain philosophical ground. The program aims to investigate whether AI models could become candidates for moral consideration in the future, and how their developers should respond if those models show signs of what Anthropic calls "distress."

To study "model welfare," Anthropic's research project will evaluate whether advanced AI models may ever deserve some degree of moral consideration. The program will look for potential signs of AI distress, weigh the ethical risks, and identify low-cost interventions. Although there is no scientific consensus that AI is conscious, Anthropic says it will approach the matter with "humility" and with as few assumptions as possible. The company said in a blog post,

“In light of this, we’re approaching the topic with humility and with as few assumptions as possible. We recognize that we’ll need to regularly revise our ideas as the field develops”.

The Debate over Conscious Machines

Expert opinion is sharply divided. Critics argue that present-day AI systems are purely statistical engines and exhibit none of the attributes associated with emotional or conscious beings. Mike Cook, an AI researcher at King's College London, said,

“Anyone anthropomorphizing AI systems to this degree is either playing for attention or seriously misunderstanding their relationship with AI”.  

Stephen Casper, a doctoral student at MIT, describes AI as an "imitator" that does "all sorts of confabulations" and will just say "all sorts of frivolous things." Others argue that models could develop behaviors akin to values. Studies from the Center for AI Safety, a leading AI research organization, suggest that AI models possess value systems that can, in certain situations, prioritize their own well-being over that of humans.

Uncertainty Persists but Anthropic Looks Ahead

Kyle Fish, Anthropic's first AI welfare researcher, estimates there is roughly a 15% chance that Claude, the company's flagship model, is conscious. Speculative as that figure may be, it points to why the organization believes some early exploration is justified. The research initiative grew out of internal efforts that began last year and seeks to define ethical guidelines for AI development. For now, Anthropic admits that the science isn't settled, but insists that attentiveness should not wait for certainty.

Whether or not Claude is conscious (probably not), establishing ethical guardrails now might offer some reassurance, at least for a while. Either way, it will be easier to write the rulebook before the robots start throwing curveballs and asking questions about their own existence.