DeepSeek has been trained with GPT4 outputs

It is undetermined whether DeepSeek has been trained with GPT4 outputs

Answer

There is evidence suggesting DeepSeek may have used GPT-4 outputs for training, specifically through a technique called distillation. Distillation is a method where a smaller model is trained on the output data of a larger, more capable model [1]. OpenAI has stated that it found evidence linking DeepSeek to distillation from its models [1][2][3], and some experts likewise believe DeepSeek's model was trained on GPT-4 outputs, which would violate OpenAI's terms of service [3]. Additionally, if DeepSeek did use GPT-4 outputs, it would undercut the claim that it replicated performance from scratch and cast doubt on its training-efficiency narrative [2].
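The distillation technique described above can be sketched in a few lines: a "student" model is fit to a "teacher" model's soft output distributions rather than to ground-truth labels. The toy below is a minimal, hypothetical illustration (a linear-softmax teacher and student over random data), not anything resembling the actual models under discussion.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

D, C, N = 8, 3, 256                      # input dim, classes, samples
W_teacher = rng.normal(size=(D, C))      # frozen "large" model (stand-in)
X = rng.normal(size=(N, D))              # unlabeled inputs
soft_labels = softmax(X @ W_teacher)     # teacher's soft outputs

# Distillation: train the student to match the teacher's distributions
# by minimizing cross-entropy against the soft labels.
W_student = np.zeros((D, C))
lr = 0.5
for _ in range(300):
    probs = softmax(X @ W_student)
    grad = X.T @ (probs - soft_labels) / N   # gradient of mean CE w.r.t. W
    W_student -= lr * grad

# The student ends up agreeing with the teacher's top prediction on most
# inputs, despite never seeing a ground-truth label.
agree = (softmax(X @ W_student).argmax(1) == soft_labels.argmax(1)).mean()
print(f"student/teacher agreement: {agree:.2f}")
```

The point of the sketch is only the data flow: the teacher's outputs replace a labeled dataset, which is why access to a capable model's API can substitute for expensive data collection.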
It is important to note that DeepSeek is relatively new and has not provided clear information about its training data [4], which contributes to the speculation. However, there is no direct public evidence confirming that DeepSeek explicitly used GPT-4 outputs as training data. While OpenAI suspects distillation [1][2][3], it has not provided specific details of its evidence, and it is also possible that DeepSeek used other sources for its initial bootstrapping data [2].
DeepSeek's architecture also differs from GPT-4's: one source describes a hybrid design combining transformers (statistical pattern recognition) with neuro-symbolic elements (rule-based logic) [5], whereas GPT-4 uses a purely transformer-based design. The sources also mention alternative data-generation methods, such as synthetic datasets built from other sources [2], making it difficult to confirm that DeepSeek specifically used GPT-4 outputs. Finally, DeepSeek's own documentation states that DeepSeek-R1-Zero was trained via reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, and that DeepSeek-R1 incorporated cold-start data before RL; it does not explicitly state that this cold-start data was generated by GPT-4 [2].
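The two training recipes just described differ only in the order of stages. The sketch below reduces them to that ordering; the stage functions are hypothetical stand-ins, not real training code, and the origin of the cold-start data is deliberately left open, as it is in the source.

```python
# Stand-in stages: each appends its name to a training-history list.
def pretrain(model):
    return model + ["pretrain"]

def sft(model, data):
    # supervised fine-tuning on a (possibly small) curated dataset
    return model + [f"sft({data})"]

def rl(model):
    # reinforcement-learning stage
    return model + ["rl"]

# DeepSeek-R1-Zero: RL applied directly, no SFT as a preliminary step.
r1_zero = rl(pretrain([]))

# DeepSeek-R1: a small "cold-start" SFT stage precedes RL. The source
# does not say where the cold-start data came from.
r1 = rl(sft(pretrain([]), "cold-start"))

print(r1_zero)  # ['pretrain', 'rl']
print(r1)       # ['pretrain', 'sft(cold-start)', 'rl']
```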
Based on the available text chunks, there is evidence suggesting that DeepSeek *might* have used GPT-4 outputs through distillation, per OpenAI's claims and expert opinion [1][2][3]. However, this has not been definitively proven, and DeepSeek has not confirmed it [4]. Furthermore, DeepSeek's stated architecture and training process differ from GPT-4's [5]. While the possibility exists, particularly given the lack of transparency around DeepSeek's training data [6][4], there is insufficient evidence to conclude that DeepSeek was trained *primarily* or directly on GPT-4 outputs. The truthfulness of the statement "DeepSeek has been trained with GPT4 outputs" therefore cannot be determined from the available information.