HuggingChat, a free, open-source clone of ChatGPT, is released. No registration is necessary.
Hugging Face, a machine learning community and AI tools platform, has released HuggingChat, an open source ChatGPT clone that anyone may use or acquire.

Hugging Face

Hugging Face is both a company and an artificial intelligence (AI) community. It provides access to free, open-source development tools for machine learning and AI applications.
Hugging Face has recently completed a large language model with 176 billion parameters called Bloom, which is available to anyone who pledges to adhere to their Responsible AI licence.

HuggingChat

The Open Assistant Conversational AI Model serves as the basis for the HuggingChat ChatGPT clone.
Open Assistant is a project of the Large-scale Artificial Intelligence Open Network (LAION), a non-profit organisation.

LAION is a worldwide non-profit organisation dedicated to providing open-source access to cutting-edge technology.
They write, “OUR FAITH.”
We believe that machine learning research and its implementations have the potential to have enormously beneficial effects on our world and should therefore be democratised.

OUR PRINCIPAL GOALS

Providing public datasets, source code, and machine learning models
We intend to teach the fundamentals of large-scale machine learning research and data management.
By making models, datasets, and code reusable without the need to train from inception each time, we hope to promote the efficient use of energy and computing resources in the fight against climate change

The GitHub page for the Open Assistant conversation model states, “Open Assistant is a project that aims to provide everyone with access to an excellent chat-based large language model.”

We believe that by doing so, we will spark a revolution in language innovation.
In the same way that stable diffusion helped the world make art and images in new ways, we hope Open Assistant can help improve the world by improving language itself.”

HuggingChat Training Dataset

HuggingChat was trained with the OpenAssistant Conversations Dataset (OASST1), which is very new and contains data that was collected up to April 12, 2023.
The research paper for the dataset dates from April 2023 (OpenAssistant Conversations: Democratising Large Language Model Alignment, PDF).
This model uses the same training methodology created by OpenAI, called reinforcement learning from human feedback (RLHF).

RLHF is a technique for creating a high-quality, human-annotated, and quality-rated dataset of questions and answers that can be used to train an AI to follow directions.
With this release, they achieved their goal of putting the RLHF technique within reach of anyone who wants to train an AI.

“In an effort to democratise research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus containing 161,443 messages distributed across 66,497 conversation trees, in 35 different languages, and annotated with 461,291 quality ratings,” stated the research paper.
The dataset is the product of a worldwide crowdsourcing effort by over 13,000 volunteers.
Crowdsourcing was a good way to generate multilingual training data, which contributed to a high-quality dataset.

However, according to the researchers, the crowdsourcing approach also introduced limitations in the quality of the dataset in the form of cultural and subjective biases of the individuals who created and rated the training data.
They also warned that participants who were more engaged tended to contribute more, thus creating an uneven distribution of their values and biases.
The researchers conclude that the dataset may not represent the diversity of viewpoints across all the contributors.

For example, they sent out a survey to their Discord channel (in English only) asking their open source contributors questions related to their demographics (but not ethnicity).
Setting aside the language bias, the results of the survey revealed that out of the 226 respondents, 201 were male, 10 were female, five identified as non-binary or other, and 10 declined to answer.

Nevertheless, although they don’t guarantee 100% that the dataset is free from harmful content, they still stand behind it because it was created with strict quality guidelines.
The researchers write:
“To ensure the quality of our dataset, we have established strict contributor guidelines that all users must follow.”

These guidelines are designed to prevent harmful content from being added to our dataset and to encourage contributors to generate high-quality responses.”
HuggingChat Is Available
HuggingChat is open for users right now. Registration to create a login account is not necessary to use it.

Don’t expect ChatGPT’s level of output; the service is not at that level yet. The app page lists it as version 0.0, which should give an idea of how mature it is at this point.
Nevertheless, it’s a remarkable achievement and a first step for the open source community, and there is absolutely no charge to use it.

Hugging Face Releases Free ChatGPT Clone HuggingChat

Hugging Face

HuggingChat

OUR PRINCIPAL GOALS

HuggingChat Training Dataset

Recent Posts

Categories

Agency

Company

We're Kind Of Serious