#imo olympiad
Explore tagged Tumblr posts
cfalindia · 1 year ago
Text
Online Jee Coaching
In today's fast-paced world, online learning has become a game-changer, and CFAL Institute is at the forefront of this transformation with its top-notch online JEE coaching program. Aspiring engineers now have the flexibility to access the best JEE coaching from the comfort of their homes. CFAL's online JEE coaching eliminates geographical barriers, allowing students from all corners of India to benefit from its expertise. Whether you live in a bustling city or a remote town, you can access top-tier JEE coaching from the comfort of your home. This convenience is particularly beneficial in times when physical attendance at coaching centers may not be feasible.
0 notes
supriyawithoutsu · 2 years ago
Text
🥲
Tumblr media
13 notes · View notes
straightlightyagami · 1 year ago
Text
part of me regrets not trying to apply to universities in the US (even though realistically it’s too far away and too complicated a process and too expensive) bc I wonder if I could have gotten into the US top 10 ones or whatever
3 notes · View notes
townpostin · 4 months ago
Text
India's Unprecedented Success in Maths Olympiad: An Unsung Story
India’s exceptional achievement in the International Math Olympiad 2024, with four gold and one silver medal, receives little recognition in a country obsessed with cricket and films. India’s brilliant performance in the International Math Olympiad 2024, winning four golds and one silver, remains overshadowed by the nation’s focus on cricket and films. Indians who love cricket and films to the…
0 notes
Text
0 notes
rahulrainbow · 1 year ago
Text
WHICH OLYMPIAD IS BEST OR AUTHENTIC AT SCHOOL LEVELS?
Which Olympiad Is Best Or Authentic At School Levels? Recently, the 64th International Mathematical Olympiad (IMO) 2023 held at Chiba, Japan (July 2-13, 2023). Here, the six-member Indian team secured 2 Gold, 2 Silver and 2 Bronze medals. . India’s rank is 9th out of 112 countries. Homi Bhabha Centre for Science Education, Mumbai (HBCSE) is core organisation for participation in International…
Tumblr media
View On WordPress
0 notes
ickysubbyboi · 5 months ago
Note
hey pretty, please talk more about more stuff that you like, you’re so cute <3
Another one that I haven’t really talked about is liking math but not in a normal way probably more like a degenerate way…
Growing up I was like a lot into it. I used to compete in those math Olympiad tournaments. And as any other regular normal kid my dream was to go to the IMO (like the highest level international Olympiad) which is unrealistic af but I only found out that years later…
When I was in high school I started going to these other courses math course offered by my local university, which was 100% focused on math competitions. And you took like Number theory, algebra, geometry and combinatorics (number theory is obviously the nicer one) and they were really hard, some of it was in a college level so that’s probably why. But it’s funny how I was pretty average on those and it wasn’t easy for me. But on my class there was this dude who’d sit at the back and read books because the already college level math was too boring for him (he was like 14 💀) so it puts into perspective how good some of these people are…
Another weird thing is that at school sometimes I’d do math in even non math related classes 😭 and I never liked notebooks so I’d write stuff down on my table then erase it afterwards…pretty sure I got in trouble at some point for that but it was so much easier
One time when I was like 13/14 me and a couple friends decided to calculate 38! Factorial by hand (terrible idea) to calculate the total amount of possibilities we could arrange all students on the class room. This took hours, I mean hours we pretty much spent the whole school period doing it, for some reason I still don’t know why. There are more math related stories I think but I already typed way too much so…anyways. I almost ended up doing a mathematics bachelors but I went for engineering instead. But I’ll probably do a double major, or at least a mathematics minor just for fun
38! Is equal to: 523022617466601111760007224100074291200000000
Just to put into perspective how stupid that is 😭
40 notes · View notes
gamingavickreyauction · 5 months ago
Text
I haven't seen anyone talk yet about the fact that an AI solved 4/6 of this year's IMO problems. Is there some way they fudged it so that it's not as big a deal as it seems? (I do not count more time as fudging- you could address that by adding more compute. I also do not count giving the question already formalised as fudging, as AIs can already do that).
I ask because I really want this not to be a big deal, because the alternative is scary. I thought this would be one of the last milestones for AI surpassing human intelligence, and it seems like the same reasoning skills required for this problem would be able to solve a vast array of other important problems. And potentially it's a hop and a skip away from outperforming humans in research mathematics.
I did not think we were anywhere near this point, and I was already pretty worried about the societal upheaval that neural networks will cause.
4 notes · View notes
mt-lowercase-m-derogatory · 4 months ago
Text
So I have my second adhd evaluation in a week and like
I've been trained for years to on how to sit down for 4.5 hours while working on 3 particularly obtuse math problems. Mathematics Olympiad shit. I've won medals on an international level [not technically IMO tho so I'm not valid, I know this].
And maybe that should be taken into account when a "slightly above average" result on the computerised test is found during my adhd eval. Which is carried by me being in the top 1% for response speed because that part goes by quicker if you answer the questions faster.
2 notes · View notes
antialiasis · 2 years ago
Note
You mentioned being a part of the International Mathematical Olympiad. How did you end up doing in the tournament, and how did you get selected?
I went twice: first to Vietnam in 2007, where I got 6 points on one of the problems, and then to Spain in 2008, where I could have easily gotten full marks on a problem but because I forgot to prove a trivial edge case I ended up with four, which was a bummer. (The IMO is structured with six problems, three per day, with each one worth seven points, and you have 4.5 hours per day to solve them; the first problem each day is the easiest and the third the hardest.)
In both cases I believe I was in the middle of the Icelandic team, scoring-wise. As usual, Iceland is a tiny country, so when you pick the six best high-school-aged Icelanders at a thing they aren’t going to be great on a global scale. The first time I went, there was one guy who was a real supergenius on an Icelandic scale, who I think got a bronze medal (half of the participants get either gold, silver or bronze), which is a rarity for us. (Same guy was also a talented pianist and played in youth league football tournaments; the kind of guy you just sort of look at and go ‘Yeah, he is just going to be better than me at literally anything.’) If I recall correctly Iceland had won one or two silver medals at the IMO ever (and zero golds). Don’t know if there have been more by now.
The selection was done via two levels of math tournaments, one with more, easier problems and then, for the top 25 scorers on that, a second one with fewer, more IMO-like (but still easier than the real thing) problems. I think I placed something like 22nd in the first one for 2007, but much higher in the second, because I’m relatively better at that sort of problem. For 2008 I actually missed the first tournament for some reason, but they invited me to the second anyway because I’d been on the team the previous year, and again I placed in the top six. In 2009 I did take part in the first tournament again but managed to bungle it by spending too much time on one of the harder, more interesting problems instead of racking up points on the easier ones, so one way or another I didn’t make the cutoff to take part in the second one.
8 notes · View notes
cfalindia · 1 year ago
Text
https://www.cfalindia.com/olympiad-exam/
0 notes
supriyawithoutsu · 2 years ago
Text
🥲
Tumblr media
3 notes · View notes
straightlightyagami · 3 months ago
Text
i arrogantly believe if math olympiads were only combinatorics i could have been an imo gold medalist
6 notes · View notes
jcmarchi · 1 month ago
Text
The Toughest Math Benchmark Ever Built
New Post has been published on https://thedigitalinsider.com/the-toughest-math-benchmark-ever-built/
The Toughest Math Benchmark Ever Built
Frontier Math approach math reasoning in LLMs from a different perspective.
Created Using DALL-E
Next Week in The Sequence:
Edge 448: Discusses into adversarial distillation including some research in that area. It also reviews the LMQL framework for querying LLMs.
The Sequence Chat: Discusses the provocative topic of the data walls in generative AI.
Edge 490: Dives into Anthropic’s crazy research about how LLMs can sabotage human evalautions.
You can subscribe to The Sequence below:
TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.
📝 Editorial: The Toughest Math Benchmark Ever Built
Mathematical reasoning is often considered one of the most critical abilities of foundational AI models and serves as a proxy for general problem-solving. Over the past few years, we have witnessed large language models (LLMs) push the boundaries of math benchmarks, scoring competitively on International Math Olympiad (IMO) problems and advancing discoveries in various areas of mathematics. From this perspective, it might seem as though LLMs are inching towards “super math powers,” but that is not entirely the case.
Much of AI’s impressive performance in math benchmarks relies on scenarios where the problem is perfectly articulated within a prompt. However, most foundational models struggle when they need to combine different ideas creatively or use “common sense” to structure and solve a problem. Can we develop benchmarks that measure these deeper reasoning capabilities?
Frontier Math, a new benchmark developed by Epoch AI, is designed to test the boundaries of artificial intelligence in advanced mathematics. Unlike traditional math benchmarks such as GSM-8K and MATH, where AI models now score over 90%, Frontier Math presents a significantly more challenging test. This higher difficulty stems from the originality of its problems, which are unpublished and crafted to resist shortcuts, requiring deep reasoning and creativity—skills that AI currently lacks.
From an AI standpoint, Frontier Math stands out by emphasizing the capacity for complex reasoning. The benchmark comprises hundreds of intricate math problems spanning diverse fields of modern mathematics, from computational number theory to abstract algebraic geometry. These problems cannot be solved through simple memorization or pattern recognition, as is often the case with existing benchmarks. Instead, they demand multi-step, logical thinking akin to research-level mathematics, often requiring hours or even days for human mathematicians to solve.
The problems within Frontier Math are specifically designed to test genuine mathematical understanding, making them “guess-proof.” This means that AI models cannot rely on pattern matching or brute-force approaches to arrive at the correct answer. The solutions, which often involve large numerical values or complex mathematical constructs, have less than a 1% chance of being guessed correctly without proper reasoning. This focus on “guess-proof” problems ensures that Frontier Math serves as a robust and meaningful test of an AI model’s ability to truly engage with advanced mathematical concepts.
Despite being equipped with tools like Python to aid in problem-solving, leading AI models—including GPT-4o and Gemini 1.5 Pro—have managed to solve fewer than 2% of the Frontier Math problems. This stands in stark contrast to their high performance on traditional benchmarks and highlights the significant gap between current AI capabilities and true mathematical reasoning.
Frontier Math provides a critical benchmark for measuring progress in AI reasoning as these systems continue to evolve. The results underscore the long journey ahead in developing AI that can genuinely rival the complex reasoning abilities of human mathematicians.
⭐️ Save your spot for SmallCon: A free virtual conference for GenAI builders! ⭐️
it’s bringing together AI leaders from Meta, Mistral AI, Salesforce, Harvey AI, Upstage, Nubank, Nvidia, and more for deep-dive tech talks, interactive panel discussions, and live demos on the latest tech and trends in GenAI. You’ll learn firsthand how to build big with small models and architect the GenAI stack of the future.
🔎 ML Research
Modular Models
This paper examines the potential of modular AI models, particularly focusing on the MoErging approach, which combines independently trained expert models to solve complex tasks. The authors, working at Microsoft Research Lab – New York City and Microsoft Research Lab – Montréal, propose a taxonomy for categorizing and comparing different MoErging methods, which can facilitate collaborative AI development and address challenges related to data privacy, model accountability, and continuous learning —> Read more.
Sematic Hub Hypothesis
This paper, authored by researchers from MIT, Allen Institute for AI and University of Southern California, propose the semantic hub hypothesis, suggesting that language models represent semantically similar inputs from various modalities close together in their intermediate layers. The authors provide evidence for this by showing that interventions in the dominant language (usually English) in this shared semantic space can predictably alter model behavior when processing other data types like Chinese text or Python code —> Read more.
GitChameleon
This work from researchers at Mila and the Max Planck Institute for Intelligent Systems presents GitChameleon, a benchmark of 116 Python-based problems that evaluate the capacity of large language models to generate code that correctly accounts for version changes in APIs. Analysis of several models on GitChameleon suggests a correlation between model size and performance on these tasks, indicating a need for future work on version-aware code generation methods —> Read more.
Stronger Models are not Stronger Teachers
This paper, written by authors from the University of Washington and the Allen Institute for AI, investigates the impact of different “teacher” models used to generate responses for synthetic instruction tuning datasets. Contrary to common assumptions, larger teacher models don’t necessarily lead to better instruction-following abilities in the tuned “student” models, a phenomenon the authors call the “Larger Models’ Paradox”. They propose a new metric called Compatibility-Adjusted Reward (CAR) to better select teacher models suited to a given student model for instruction tuning —> Read more.
Counterfactual Generation in LLMs
Researchers from the ETH AI Center and the University of Copenhagen introduce a framework in this paper for generating counterfactual strings from language models by treating them as Generalized Structural-equation Models using the Gumbel-max trick. Applying their technique to evaluate existing intervention methods like knowledge editing and steering, they find that these methods often cause unintended semantic shifts, illustrating the difficulty of making precise, isolated modifications to language model behavior —> Read more.
Watermarking Anything
This work by authors at Meta presents WAM, a new deep learning model that treats invisible image watermarking as a segmentation problem. The model excels at detecting, localizing, and extracting multiple watermarks embedded in high-resolution images while maintaining invisibility to the human eye and resisting attempts to remove or alter the watermarks —> Read more.
🤖 AI Tech Releases
Stripe for AI Agents
Stripe released an SDK for AI agents —> Read more.
Frontier Math
FrontierMath is, arguably, the toughest math benchmark ever created —> Read more.
AlphaFold 3
Google DeepMind open sourced a new version of its Alpha Fold model for molecular biology —> Read more.
🛠 Real World AI
Airbnb’s Photo Tours
Airbnb discusses their use of vision transformers to enable their photo tour feature —> Read more.
📡AI Radar
AI legend Francois Chollet announced he will be leaving Google.
Cogna raised $15 million to build AI that can write enterprise software.
OpenAI seems to be inching closer to launch an AI agent for task automation.
Perplexity is experimenting with ads.
AMDis laying off 4% of its global staff, approximately 1,000 employees, in an effort to gain a stronger foothold in the expanding AI chip market dominated by Nvidia.
Tessl.io, a company focused on AI-driven software development, has raised $125 million in funding to develop a new, open platform for AI Native Software.
Lume, a company that leverages AI to automate data integration, has secured $4.2 million in seed funding to address the persistent challenge of moving data seamlessly between systems.
Magic Story, launched a children’s media platform that utilizes AI to create personalized stories with the goal of nurturing confidence and growth in children.
ServiceNow, a digital workflow company, is releasing over 150 new generative AI features to its Now Platform, which includes enhancements for Now Assist and an AI Governance offering to ensure secure and compliant AI practices.
Red Hat is acquiring Neural Magicto bolster its hybrid cloud AI portfolio and make generative AI more accessible to enterprises.
Snowflake announced a series of key updates at its BUILD conference, focused on improving its AI capabilities and security, with notable additions including enhancements to Cortex AI, the launch of Snowflake Intelligence, and new threat prevention measures.
Sema4.ai has introduced its Enterprise AI Agent Platform, designed to empower business users with the ability to create and manage AI agents, ultimately aiming to automate complex tasks and streamline workflows.
DataRobot launched a new platform for creating generative AI applications. Specifically, the platform focuses on AI agents and collaborative AI.
Perplexity is experimenting with incorporating advertising on its platform to generate revenue for publisher partners and ensure the long-term sustainability of its services while emphasizing its commitment to providing unbiased answers.
Writer, a company focused on generative AI for enterprises, has successfully raised $200 million in Series C funding, reaching a valuation of $1.9 billion, with plans to utilize the new capital to further develop its full-stack generative AI platform and its agentic AI capabilities.
TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.
0 notes
Text
0 notes
govindhtech · 4 months ago
Text
OpenAI o1-preview, o1-mini: Advanced Reasoning Models
Tumblr media
OpenAI o1-preview, OpenAI o1-mini, A new collection of models for reasoning that address challenging issues.
OpenAI o1-preview
OpenAI has created a new line of AI models that are meant to deliberate longer before reacting. Compared to earlier versions, they can reason their way through challenging tasks and tackle more challenging math, science, and coding challenges.
- Advertisement -
The first installment of this series is now available through ChatGPT and its API. OpenAI anticipates frequent upgrades and enhancements as this is only a preview. OpenAI is also including evaluations for the upcoming upgrade, which is presently being developed, with this release.
How it functions
These models were trained to think through situations more thoroughly before responding, much like a human would. They learn to try various tactics, improve their thought processes, and own up to their mistakes through training.
In OpenAI experiments, the upcoming model upgrade outperforms PhD students on hard benchmark tasks in biology, chemistry, and physics. It also performs exceptionally well in coding and math. GPT-4o accurately answered only 13% of the questions in an exam used to qualify for the International Mathematics Olympiad (IMO), compared to 83% for the reasoning model. Their coding skills were tested in competitions, and in Codeforces tournaments, they scored in the 89th percentile.
Many of the functions that make ChatGPT valuable are still missing from this early model, such as posting files and photographs and searching the web for information. In the near future, GPT-4o will be more capable in many typical instances.
- Advertisement -
However, this marks a new level of AI power and a substantial advancement for complicated thinking tasks. In light of this, OpenAI is calling this series OpenAI o1-preview and resetting the counter to 1.
Security
In the process of creating these new models, OpenAI is also developed a novel method for safety training that uses the models’ capacity for reasoning to force compliance with safety and alignment requirements. It can implement their safety regulations more successfully by reasoning about them in the context of the situation.
Testing how effectively their model adheres to its safety guidelines in the event that a user attempts to circumvent a process known as “jailbreaking” is one method they gauge safety. GPT-4o received a score of 22 (out of 100) on one of OpenAI’s most difficult jailbreaking tests, but OpenAI o1-preview model received an 84. Further information about this can be found in their study post and the system card.
OpenAI has strengthened its safety work, internal governance, and federal government coordination to match the enhanced capabilities of these models. This includes board-level review procedures, such as those conducted by its Safety & Security Committee, best-in-class red teaming, and thorough testing and evaluations utilizing its Preparedness Framework.
OpenAI recently finalized collaborations with the AI Safety Institutes in the United States and the United Kingdom to further its commitment to AI safety. OpenAI has initiated the process of putting these agreements into practice by providing the institutes with preliminary access to a research version of this model. This was a crucial initial step in its collaboration, assisting in the development of a procedure for future model research, assessment, and testing both before and after their public release.
For whom it is intended
These improved thinking skills could come in handy while solving challenging puzzles in math, science, computing, and related subjects. For instance, physicists can use OpenAI o1-preview to create complex mathematical formulas required for quantum optics, healthcare researchers can use it to annotate cell sequencing data, and developers across all domains can use it to create and implement multi-step workflows.
OpenAI O1-mini
The o1 series is excellent at producing and debugging complex code with accuracy. OpenAI is also launching OpenAI o1-mini, a quicker, less expensive reasoning model that excels at coding, to provide developers with an even more effective option. For applications requiring reasoning but not extensive domain knowledge, o1-mini is a powerful and economical model because it is smaller and costs 80% less than o1-preview.
How OpenAI o1 is used
Users of ChatGPT Plus and Team will have access to o1 models as of right now. The model selector allows you to manually choose between o1-preview and o1-mini. The weekly rate limits at launch will be 30 messages for o1-preview and 50 for o1-mini. The goal is to raise those rates and make ChatGPT capable of selecting the appropriate model on its own for each request.
Users of ChatGPT Edu and Enterprise will have access to both models starting next week.
With a rate limit of 20 RPM, developers that meet the requirements for API usage tier 5(opens in a new window) can begin prototyping with both models in the API right now. Following more testing, OpenAI aims to raise these restrictions. Currently, these models lack support for system messaging, streaming, function calling, and other capabilities in their API. Check out the API documentation to get started.
OpenAI also intends to provide all ChatGPT Free users with access to o1-mini.
Next up
These reasoning models are now available in ChatGPT and the API as an early release. To make them more helpful to everyone, it plans to add browsing, file and image uploading, and other capabilities in addition to model updates.
In addition to the new OpenAI o1 series, OpenAI also wants to keep creating and publishing models in its GPT series.
Read more on govindhtech.com
0 notes