What’s Happening?
Epoch AI, a nonprofit, created a math test called FrontierMath to measure how good AI systems are at solving tough math problems.
Recently, it was revealed that OpenAI funded this project and used FrontierMath to showcase its new AI system.
So What’s the Main Problem?
Epoch AI didn’t tell the public or the people who worked on FrontierMath that OpenAI was funding the project until December 20. Contributors were upset to learn only then that their work was being used by OpenAI.
Concerns this problem has raised:
- Transparency Issues:
Transparency is the core problem here.
The undisclosed collaboration and funding made people doubt whether FrontierMath is an unbiased test for AI.
- Exclusive Access:
Exclusive, early access let OpenAI see FrontierMath's contents ahead of time, giving it a competitive advantage over rival AI companies that did not have the same access.
What Does Co-Founder Tamay Besiroglu Have to Say?
Epoch AI Admitted Its Mistake:
One of its leaders, Tamay Besiroglu, said Epoch AI couldn't disclose the funding due to contractual limits with OpenAI, but that it should have made transparency its number one priority. He also said that OpenAI didn't use FrontierMath to train its AI and that the integrity of FrontierMath hadn't been compromised.
Epoch AI has a separate set of math problems that they haven’t shown to anyone, including OpenAI. This hidden set is called a “holdout set”, and it’s used to double-check AI performance, ensuring fairness.
Even as Epoch AI tried to put out the fire, some of it kept burning:
Epoch AI is trusting that OpenAI is being honest about its AI system's performance on the FrontierMath test; it hasn't run its own checks to confirm whether the results are accurate.
Its careless actions and responses have left a mark on its reputation and credibility.
Why Does It Matter?
This episode highlights how hard it is to create fair tests for AI when the people building them rely on funding from companies like OpenAI.
That reliance can introduce further bias into how AI is evaluated.