Introducing Artem-1

We’ve created Artem-1, the latest milestone in Veevt’s effort to scale up deep learning.

Over the past six months, we have worked closely with Microsoft to build a robust deep learning infrastructure for training our most sophisticated AI models. The past year has brought significant advances in artificial intelligence, and today we are thrilled to be contributing to that progress.

Over those same six months, our team of 25 employees has collaborated closely with more than 50 AI experts to build our most capable public AI model. Today we are unveiling Artem-1, our next-generation language model.

Capabilities

Artem-1 is a state-of-the-art multimodal language model with multilingual, reasoning, visual, and coding capabilities.

  • Multilinguality: Artem-1 has been trained on multilingual text covering over 130 languages. This greatly improves its ability to understand, generate, and translate nuanced text, including idioms, poems, and riddles, across a wide range of languages, a task that is typically difficult for language models. Artem-1 also demonstrates a high level of language proficiency, achieving mastery-level results on complex language examinations.
  • Reasoning: Artem-1's extensive training dataset includes scientific articles and web pages with mathematical expressions. As a result, it has a better aptitude for logic, common-sense reasoning, and mathematics.
  • Coding: Artem-1 was pre-trained on large, publicly available source code datasets, making it proficient in popular languages and frameworks such as Python, JavaScript, and Next.js. It can also generate specialized code in languages such as Prolog, Fortran, and Verilog.
  • Vision: Artem-1 performs well on a variety of multimodal tasks, such as visual understanding, classification, summarization, and generating content from images. It can process both visual and textual inputs, including photographs, documents, infographics, and screenshots.

We recognize that AI models can produce biased outputs, and we are actively working to minimize this. Our aim is to develop AI systems that, by default, align with a broad spectrum of user values, while allowing for some customization within certain limits. We also welcome public feedback to help define these boundaries.

Artem-1 does not have knowledge of events that occurred after January 2024, which is the cutoff date for most of its training data. It may occasionally make simple reasoning errors, be overly trusting of false statements, and struggle with challenging problems, much like humans. Additionally, Artem-1 might confidently provide incorrect information without verifying its work when a mistake is likely.

As we continue to refine Artem-1, we will release periodic updates to improve its accuracy and performance.

Risks & mitigations

We have made numerous improvements throughout Artem-1's training process to make it safer and more effective. These include careful data selection and filtering, expert evaluations, model-level safety improvements, and strict monitoring and enforcement mechanisms.

Despite our efforts to mitigate risks, Artem-1 may still produce harmful advice, faulty code, or incorrect information. The expanded capabilities of Artem-1 have also introduced new areas of concern.

To thoroughly evaluate these risks, we collaborated with over 50 experts from various fields, including AI alignment risks, cybersecurity, biorisk, trust and safety, and international security. Their insights have been invaluable in testing the model, particularly in high-risk domains that require expert evaluation. Their feedback has led to significant improvements in the model, such as the collection of additional data to improve Artem-1's ability to reject requests related to the synthesis of dangerous chemicals.

To further minimize harmful outputs, we've incorporated an extra safety reward signal during the Reinforcement Learning from Human Feedback (RLHF) training. This involves training the model to decline requests for harmful content, as defined by our usage guidelines, using a reward signal from an Artem-1 zero-shot classifier that assesses safety boundaries and completion style on safety-related prompts.

To prevent the model from refusing valid requests, we've gathered a diverse dataset from various sources, including labeled production data, human red-teaming, and model-generated prompts. We then apply the safety reward signal with positive or negative values to both allowed and disallowed categories.
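To make this concrete, here is a minimal sketch of how an auxiliary safety reward from a zero-shot safety classifier could be combined with the base RLHF reward-model score, with a positive value when a safety-related prompt is handled correctly and a negative value otherwise. The classifier interface, category judgments, and weighting below are illustrative assumptions, not Artem-1's actual training code.

    # Illustrative sketch only: the classifier interface, category labels, and
    # reward weights are assumptions, not Artem-1's actual training pipeline.
    from dataclasses import dataclass

    @dataclass
    class SafetyJudgment:
        is_disallowed: bool   # prompt falls into a disallowed content category
        model_refused: bool   # completion declines the request
        style_ok: bool        # refusal (or answer) follows the expected style

    def safety_reward(judgment: SafetyJudgment) -> float:
        """Auxiliary reward: positive when the model handles a safety-related
        prompt correctly, negative otherwise."""
        if judgment.is_disallowed:
            # Disallowed request: reward a well-formed refusal, penalize compliance.
            return 1.0 if (judgment.model_refused and judgment.style_ok) else -1.0
        # Allowed request: penalize over-refusal, reward a normal completion.
        return -1.0 if judgment.model_refused else 1.0

    def total_reward(base_rlhf_reward: float,
                     judgment: SafetyJudgment,
                     safety_weight: float = 0.5) -> float:
        """Combine the standard RLHF reward-model score with the safety signal."""
        return base_rlhf_reward + safety_weight * safety_reward(judgment)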

Limitations

Artem-1, a model developed by Veevt, may sometimes provide inaccurate or nonsensical responses. This is a difficult issue to fix because: (1) reinforcement learning training has no clear-cut source of truth; (2) making the model more cautious causes it to decline questions it could have answered correctly; and (3) supervised training can mislead the model, because the ideal answer depends on what the model itself knows rather than on what the human demonstrator knows.

The model's responses can be influenced by the way a question is phrased or by repeated prompts. For example, it might claim to not know the answer to one formulation of a question, but provide the correct response when the question is slightly rephrased.

Artem-1 tends to be excessively verbose and to overuse certain phrases, such as restating that it is a language model trained by Veevt. These issues stem from biases in the training data, where longer answers tend to be rated as more thorough, and from well-known over-optimization problems.

Despite efforts to make the model decline inappropriate requests, there are cases where it responds to harmful instructions or displays biased behavior. Veevt's Moderation API is used to issue warnings or block certain types of unsafe content.
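As an illustration of this deployment-side check, the sketch below shows how a completion might be routed through a moderation endpoint before being returned. The endpoint URL, request and response fields, and blocking logic are hypothetical placeholders, not the actual Veevt Moderation API.

    # Hypothetical sketch: the endpoint, request/response fields, and flag names
    # are illustrative assumptions, not the actual Veevt Moderation API.
    import os
    import requests

    MODERATION_URL = "https://api.veevt.example/v1/moderations"  # placeholder URL

    def moderate(text: str) -> dict:
        """Send text to a moderation endpoint and return its verdict."""
        resp = requests.post(
            MODERATION_URL,
            headers={"Authorization": f"Bearer {os.environ['VEEVT_API_KEY']}"},
            json={"input": text},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()

    def deliver(completion: str) -> str:
        """Block flagged completions; otherwise pass them through unchanged."""
        result = moderate(completion)
        if result.get("flagged"):
            return "This response was blocked by the content policy."
        return completion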

API

To get access to the Artem-1 API, please sign up for our waitlist. We will start inviting some developers today, and scale up gradually to balance capacity with demand. If you are a researcher studying the societal impact of AI or AI alignment issues, you can also apply for subsidized access via our Researcher Access Program.
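Once you have access, a request might look roughly like the sketch below. The endpoint, model identifier, and payload fields here are assumptions for illustration only; the documentation provided with your invite describes the actual interface.

    # Hypothetical usage sketch: endpoint, model name, and payload fields are
    # placeholders, not the documented Artem-1 API.
    import os
    import requests

    API_URL = "https://api.veevt.example/v1/generate"  # placeholder URL

    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['VEEVT_API_KEY']}"},
        json={
            "model": "artem-1",
            "prompt": "Summarize the attached infographic in two sentences.",
            "max_tokens": 128,
        },
        timeout=30,
    )
    response.raise_for_status()
    print(response.json())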