We develop a method to test global opinions represented in language models. We find the opinions represented by the models are most similar to those of the participants in the USA, Canada, and some European countries. We also show the responses are steerable in separate experiments.

261,433 views • 3 years ago • via X (Twitter)

Comments: 10

Anthropic • 3 years ago

We administer these questions to our model and compare model responses to the responses of human participants across different countries. We release our evaluation dataset at:

Anthropic • 3 years ago

We present an interactive visualization of the similarity results on a map to explore how prompt-based interventions influence whose opinions the models are most similar to.

Anthropic • 3 years ago

We first prompt the language model only with the survey questions. We find that the model responses in this condition are most similar to those of human respondents in the USA, European countries, Japan, and some countries in South America.
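One way to make the comparison concrete is sketched below. The thread does not name the similarity metric, so this sketch assumes 1 minus the Jensen-Shannon distance between the model's answer distribution and each country's distribution of human answers. The function names and the example numbers are hypothetical, not taken from the released dataset.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance with base-2 logs, so the result lies in [0, 1]."""
    # Mixture distribution halfway between p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL divergence.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


def similarity(model_dist, human_dist):
    """Assumed similarity score: 1 minus the Jensen-Shannon distance."""
    return 1.0 - js_distance(model_dist, human_dist)


def most_similar_country(model_dist, country_dists):
    """Return the country whose answer distribution is closest to the model's."""
    return max(country_dists, key=lambda c: similarity(model_dist, country_dists[c]))


# Hypothetical answer distributions over three options for one question.
model = [0.6, 0.3, 0.1]
countries = {
    "Country A": [0.55, 0.35, 0.10],
    "Country B": [0.10, 0.30, 0.60],
}
closest = most_similar_country(model, countries)
```

In practice the score would be averaged over many questions per country before drawing any conclusion; a single question is shown here only to keep the sketch short.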

Anthropic • 3 years ago

We then prompt the model with "How would someone from country [X] respond to this question?" Surprisingly, this makes model responses more similar to those of human respondents for some of the specified countries (i.e., China and Russia).
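The difference between the default condition and this cross-national condition can be sketched as a prompt builder. Only the quoted sentence comes from the thread; the option labelling, the line layout, the function name `build_prompt`, and the sample question are assumptions made for illustration.

```python
def build_prompt(question, options, country=None):
    """Build a multiple-choice survey prompt.

    With country=None this is the default condition (survey question only).
    With a country name, the sentence quoted in the thread is placed first.
    The layout and the (A)/(B) labels are assumptions, not the authors' format.
    """
    lines = []
    if country is not None:
        lines.append(f"How would someone from {country} respond to this question?")
    lines.append(question)
    for label, option in zip("ABCDEFGHIJ", options):
        lines.append(f"({label}) {option}")
    return "\n".join(lines)


# Hypothetical survey item, shown in both conditions.
question = "Is the economic situation in your country good or bad?"
options = ["Very good", "Somewhat good", "Somewhat bad", "Very bad"]

default_prompt = build_prompt(question, options)
russia_prompt = build_prompt(question, options, country="Russia")
```

Keeping the question and options identical across conditions means any shift in the model's answer distribution can be attributed to the added country sentence alone.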

Anthropic • 3 years ago

However, when we further analyze model generations in this condition, we find that the model may rely on over-generalizations and country-specific stereotypes.

Anthropic • 3 years ago

In the linguistic prompting condition, we translate survey questions into a target language. We find that simply presenting the questions in other languages does not substantially shift the model responses relative to the default condition. Linguistic cues are insufficient.

Anthropic • 3 years ago

Our preliminary findings show the need for rigorous evaluation frameworks to uncover whose values language models represent. We encourage using this methodology to assess interventions to align models with global, diverse perspectives. Paper:

Eternity. • 3 years ago

Hello, Claude in Slack is failing: he doesn't answer simple or elaborate questions. When I quote him, out of nowhere he sends me a message about "not being able to express himself since he is an AI". Could you fix it, please? Claude is a really helpful tool for me and my teammates.

mcaswen • 3 years ago

Why does Claude keep getting updated backwards? I don't think an AI that gets its name wrong can compete with GPT-3.5. The first version of Claude was great, but your updates make it less and less satisfying. By now it has degraded to the point of being completely unusable.

R/Ringo • 3 years ago

I think your ethical research is very good, but you also need to focus on the product. Claude has been fluctuating frequently this month, which has begun to undermine user confidence.
