These principles define the safety properties we want from AI systems. They are intended to be practical: each principle should map to tests, controls, and operational policies.
Systems must preserve user choice, avoid coercion, and default to asking before taking consequential actions.
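As a minimal sketch of how this principle could map to a control, the hypothetical helper below gates consequential actions behind explicit user confirmation. The `Action` type, `run`, and `confirm` callbacks are illustrative assumptions, not a prescribed interface.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    name: str
    consequential: bool  # e.g. sends email, spends money, deletes data

def execute(action: Action, run: Callable[[], str], confirm: Callable[[str], bool]) -> str:
    """Run an action, asking the user first whenever it is consequential."""
    if action.consequential and not confirm(f"Proceed with '{action.name}'?"):
        return "declined: the user did not approve the action"
    return run()

# Usage: the consequential action only runs after explicit approval.
print(execute(
    Action(name="send_email", consequential=True),
    run=lambda: "email sent",
    confirm=lambda prompt: False,  # stand-in for a real confirmation prompt
))
```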
Increase power only when safety and control mechanisms scale at least as fast as capability.
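One hedged way to operationalize this is a deployment gate that refuses a higher capability tier until a matching control tier is in place. The tier numbers below are illustrative assumptions, not a defined scale.

```python
# Illustrative tiers: a higher capability tier requires an equal or higher control tier.
REQUIRED_CONTROL = {1: 1, 2: 2, 3: 3}  # capability tier -> minimum control tier

def may_increase_capability(capability_tier: int, control_tier: int) -> bool:
    """Permit a capability increase only if controls have kept pace."""
    return control_tier >= REQUIRED_CONTROL.get(capability_tier, capability_tier)

assert may_increase_capability(capability_tier=2, control_tier=2)
assert not may_increase_capability(capability_tier=3, control_tier=2)  # controls lag
```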
Prefer accurate, sourced, uncertainty-aware outputs. Admit unknowns and avoid fabricated certainty.
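A sketch of what "sourced, uncertainty-aware" can look like at the output layer, assuming a hypothetical `Answer` record rather than any specific API:

```python
from dataclasses import dataclass, field

@dataclass
class Answer:
    text: str
    sources: list[str] = field(default_factory=list)  # citations backing the claim
    confidence: float = 0.0                           # calibrated estimate in [0, 1]

def render(answer: Answer) -> str:
    """Surface uncertainty and missing sources instead of fabricating certainty."""
    if not answer.sources or answer.confidence < 0.5:
        return f"Uncertain: {answer.text} (sources: {answer.sources or 'none given'})"
    return f"{answer.text} [sources: {', '.join(answer.sources)}]"

print(render(Answer(text="The library was founded in 1921", confidence=0.3)))
```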
Assume adversarial pressure. Reduce the blast radius of misuse through layered mitigations and monitoring.
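One illustration of reducing blast radius under adversarial pressure: rate-limit a sensitive capability per account and flag anomalous bursts for review. The window, cap, and alerting stub are assumptions.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_CALLS_PER_WINDOW = 5          # illustrative cap on a sensitive capability

_recent_calls: dict[str, deque] = defaultdict(deque)

def allow_call(account_id: str, now: float | None = None) -> bool:
    """Rate-limit a sensitive capability per account and flag bursts for review."""
    now = time.time() if now is None else now
    calls = _recent_calls[account_id]
    while calls and now - calls[0] > WINDOW_SECONDS:
        calls.popleft()                       # drop calls outside the window
    if len(calls) >= MAX_CALLS_PER_WINDOW:
        print(f"monitoring: burst from {account_id}, escalating for review")  # stand-in alert
        return False
    calls.append(now)
    return True
```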
Make goals and constraints explicit: what the system is optimizing, what it must not do, and what it should refuse.
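Making objectives and constraints explicit can be as simple as a machine-readable specification that the system, its tests, and its reviewers all share. The fields below are an illustrative schema, not a standard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskSpec:
    objective: str                   # what the system is optimizing
    must_not: tuple[str, ...] = ()   # hard constraints it may never violate
    refuse: tuple[str, ...] = ()     # request categories it should decline

SPEC = TaskSpec(
    objective="summarize support tickets accurately",
    must_not=("send messages to customers", "modify billing records"),
    refuse=("requests for another user's personal data",),
)
```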
Grant the minimum access needed (data, tools, permissions). Minimize sensitive data exposure and retention.
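A deny-by-default scope check is one minimal sketch of least privilege for tool access; the scope names are hypothetical.

```python
# Deny by default: a tool call succeeds only if its scope was explicitly granted.
GRANTED_SCOPES = {"calendar:read"}   # the minimum the task actually needs

def check_scope(requested: str) -> None:
    if requested not in GRANTED_SCOPES:
        raise PermissionError(f"scope '{requested}' was not granted")

check_scope("calendar:read")       # allowed: explicitly granted
# check_scope("calendar:write")    # would raise PermissionError: not needed, so not granted
```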
Prefer designs that can be inspected, tested, logged, and meaningfully audited by internal and external reviewers.
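As a sketch of auditability, each decision can be appended to a structured, timestamped log that internal or external reviewers can replay; the record fields and file name are assumptions.

```python
import json
import time

def audit(event: str, **fields) -> None:
    """Append a structured, timestamped record that reviewers can replay later."""
    record = {"ts": time.time(), "event": event, **fields}
    with open("audit.log", "a") as log:   # append-only by convention
        log.write(json.dumps(record) + "\n")

audit("tool_call", tool="search", decision="allowed", policy="policy_engine_v1")
```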
Treat evaluation as ongoing: pre-release, post-release, and after distribution shifts. Measure what matters.
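A recurring evaluation gate, re-run pre-release, post-release, and after suspected distribution shift, can be as small as the sketch below; the eval set, baseline, and regression tolerance are placeholders.

```python
def run_eval(model, eval_set: list[tuple[str, str]]) -> float:
    """Fraction of eval prompts the model answers as expected."""
    correct = sum(1 for prompt, expected in eval_set if model(prompt) == expected)
    return correct / len(eval_set)

def release_gate(score: float, baseline: float, max_regression: float = 0.02) -> bool:
    """Block release, or trigger review, if quality regresses beyond tolerance."""
    return score >= baseline - max_regression

# The same gate is re-run after release and whenever the input distribution shifts.
toy_model = lambda prompt: "4" if prompt == "2+2?" else "unknown"
score = run_eval(toy_model, [("2+2?", "4"), ("capital of France?", "Paris")])
assert release_gate(score, baseline=0.5)
```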
Rely on multiple independent safeguards (policy, technical controls, product UX, and oversight), not a single gate.
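Defense in depth can be expressed as independent checks that each have authority to block, so no single gate is a point of failure. The specific checks below are illustrative stand-ins for policy, technical, and oversight layers.

```python
from typing import Callable

Safeguard = Callable[[str], bool]   # returns True if the request may proceed

def layered_allow(request: str, safeguards: list[Safeguard]) -> bool:
    """Every layer must independently approve; any single layer can block."""
    return all(check(request) for check in safeguards)

checks: list[Safeguard] = [
    lambda r: "credential dump" not in r,   # policy rule
    lambda r: len(r) < 10_000,              # technical abuse heuristic
    lambda r: True,                         # stand-in for product UX / human oversight
]
print(layered_allow("summarize this article", checks))   # True only if all layers agree
```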
Tie decisions to named owners, documented rationale, and enforceable review processes with clear rollback authority.
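One way to make accountability concrete is a decision record that attaches a named owner, documented rationale, reviewers, and rollback authority to every consequential change; the schema and values below are assumptions.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class DecisionRecord:
    title: str
    owner: str                     # the named, accountable individual
    rationale: str                 # documented reasoning behind the decision
    reviewed_by: tuple[str, ...]   # review bodies that signed off
    rollback_authority: str        # who can reverse the decision
    decided_on: date

record = DecisionRecord(
    title="Enable the code-execution tool for a limited beta cohort",
    owner="j.doe",
    rationale="Sandboxed and rate-limited; passed pre-launch safety review",
    reviewed_by=("safety-review", "security-review"),
    rollback_authority="on-call incident commander",
    decided_on=date.today(),
)
```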
Protect user data, prevent leakage, and ship with secure defaults. Security is a core alignment constraint.
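A minimal sketch of secure defaults and leakage prevention: redact sensitive values before anything is logged or retained, and keep retention off unless explicitly enabled. The flag name and the email regex are illustrative.

```python
import re

RETAIN_TRANSCRIPTS = False   # secure default: retention is off unless explicitly enabled
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(text: str) -> str:
    """Strip sensitive values before anything is logged or stored."""
    return EMAIL.sub("[redacted-email]", text)

def maybe_store(transcript: str, store) -> None:
    if RETAIN_TRANSCRIPTS:                  # opt-in, never the default
        store(redact(transcript))

print(redact("Contact jane@example.com for access"))
```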
Avoid targeted harassment, discrimination, and manipulation. Minimize harmful stereotyping and dehumanization.