It's been nearly six months since Google introduced AI search with the promise of taking the legwork out of searching. AI search definitely has the potential to offer us a respite from the downsides of heavily SEO- and advertisement-driven search. In principle, an AI system's ability to interpret queries with greater specificity should make results more relevant. And with sponsorship and advertising agendas removed, shouldn't results derived without human involvement inherently be more neutral and objective?
There is a persistent myth of objectivity around AI, perhaps because people assume that once the systems are deployed, they can function without any human intervention. In reality, developers constantly tweak and refine algorithms, making subjective decisions about which results are more relevant or appropriate. Moreover, the immense corpus of data that machine learning models train on can also be polluted. Tech giants learnt this the hard way when their AI advised people to eat rocks and to use glue to keep cheese from sliding off pizza, perpetuated housing and hiring discrimination, severely cut healthcare benefits to deserving beneficiaries, amplified demographic stereotypes in generated images, recorded an error rate of nearly 35% when recognizing darker-skinned women in facial recognition systems, and attempted fixes that waded into ethically murky waters before backing off.
The spate of new regulations cropping up around AI is ample proof that people are concerned. Objectivity and neutrality remain a tall ask for AI at present. While it's relatively easy to remove the human from the loop, our biases tend to stick around like rust in the machinery, and AI has a way of amplifying them manifold.
It has become common practice to blame biased or corrupted training data for algorithmic biases. While it is common for data to have gaps, be inaccurate, and reflect existing biases, bias can also creep in at any stage of the deep learning pipeline. Problems often arise when you take a qualitative concept, such as recidivism or trustworthiness, and try to translate it 1:1 into a quantitative problem that a machine can handle. This translation is usually handled by companies with material interests and business targets that are not necessarily focused on bias mitigation.
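To make this concrete, here is a minimal sketch, with entirely hypothetical field names, of how such a translation typically happens: the qualitative question quietly becomes a measurable proxy, and subjective choices pile up at each step.

```python
# A minimal sketch of "problem translation": the qualitative question
# "will this person reoffend?" quietly becomes a measurable proxy.
# All field names here are hypothetical, for illustration only.

def make_label(record: dict) -> int:
    # Subjective choice #1: "recidivism" becomes "re-arrested within two years".
    # Arrest records reflect policing patterns, not just behavior, so any bias
    # in who gets arrested is baked into the ground truth itself.
    return 1 if record["rearrested_within_2_years"] else 0

def make_features(record: dict) -> list:
    # Subjective choice #2: which attributes count as "relevant".
    # Seemingly neutral fields (such as zip code) can proxy for race or income.
    return [record["age"], record["prior_arrests"], record["zip_code"]]

record = {"age": 24, "prior_arrests": 1, "zip_code": 94110,
          "rearrested_within_2_years": False}
print(make_features(record), make_label(record))  # [24, 1, 94110] 0
```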
Moreover, once the concept is quantified for the machine with a stated goal, the algorithm will try to solve it in the most efficient way possible. However, the most efficient way may not always be what humans perceive to be ethical. For instance, it's not uncommon for evolutionary algorithms to beat human players at games by discovering hidden bugs or exploits rather than playing the game as a human would.
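As a toy illustration of this kind of specification gaming (an invented example, not any real system), consider an optimizer told only to "maximize time alive" in a game that happens to allow pausing:

```python
# A toy example of specification gaming: given the literal goal "maximize
# time alive", a crude evolutionary search discovers that pausing forever
# is optimal -- technically correct, but not what any human intended.
# Everything here is invented for illustration.
import random
from enum import Enum

class Action(Enum):
    MOVE_LEFT = 0
    MOVE_RIGHT = 1
    PAUSE = 2  # the exploit: the clock keeps ticking, but nothing can kill you

def survival_time(policy: Action, max_steps: int = 100) -> int:
    """Score a fixed-action policy by how many steps it survives."""
    hazard = 0.2  # chance of dying on each active step
    steps = 0
    for _ in range(max_steps):
        steps += 1
        if policy is not Action.PAUSE and random.random() < hazard:
            break  # died
    return steps

random.seed(0)
# Sample candidate policies and keep the fittest, averaged over 20 episodes.
candidates = [random.choice(list(Action)) for _ in range(50)]
best = max(candidates, key=lambda a: sum(survival_time(a) for _ in range(20)))
print(best)  # Action.PAUSE -- optimal under the stated goal, useless as gameplay
```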
And finally, bias can also come into play when companies select particular attributes to train the model on for a particular goal and exclude others. These attributes are usually chosen for relevance and accuracy, not bias mitigation. Algorithmic prioritization is a hotly contested topic: search algorithms rank results based on metrics, and these metrics usually reflect the preferences of developers, companies, and stakeholders, not the consumer. Developers make decisions on how search results are prioritized, and these decisions are rarely neutral. As a result, certain voices, perspectives, or information might be amplified while others get marginalized.
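A hedged sketch of what that can look like in practice: a hypothetical ranking function whose hand-picked weights quietly encode the developers' priorities. The names and weights are invented for illustration and come from no real search engine.

```python
# A hypothetical search-ranking scorer. The weights are editorial decisions:
# favoring `engagement` over `relevance` systematically amplifies whatever
# already gets clicks, regardless of quality or viewpoint diversity.

WEIGHTS = {
    "relevance": 0.4,    # how well the document matches the query
    "engagement": 0.5,   # historical click-through rate -- a popularity bias
    "freshness": 0.1,    # recency of the document
    # Note what is absent: no term rewards source diversity or accuracy.
}

def score(doc: dict) -> float:
    return sum(WEIGHTS[k] * doc[k] for k in WEIGHTS)

docs = [
    {"id": "niche-expert-post", "relevance": 0.9, "engagement": 0.2, "freshness": 0.5},
    {"id": "viral-listicle",    "relevance": 0.5, "engagement": 0.9, "freshness": 0.9},
]
for d in sorted(docs, key=score, reverse=True):
    print(d["id"], round(score(d), 2))
# viral-listicle 0.74, niche-expert-post 0.51 -- the metric, not the user, decided.
```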
Lack of objectivity or neutrality in AI can have serious real-world consequences, as outlined above. Biased results can also promote misinformation, restrict access to diverse viewpoints, perpetuate harmful stereotypes, and present false realities. The sheer ubiquity of AI algorithms has made it critical for us to ensure fair and accurate representation of reality, and yet that's easier said than done.
The main problem is that biases are almost always uncovered retroactively. Even when a company is careful to use clean data (a tall ask given the sheer size of some training corpora), it is very hard to anticipate the downstream impacts of data and training choices. Subsequent retraining of models is not always effective, as the model can still pick up on latent biases that are not immediately obvious to us. In addition, the process for building a deep learning model is not particularly suited to bias detection.
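Here is one common way this shows up, sketched with synthetic data: even after a protected attribute is dropped from the training set, a correlated and seemingly innocent feature carries nearly the same signal, so retraining on the "clean" features changes little.

```python
# A sketch of latent bias with synthetic data: we drop the protected
# attribute before training, but a correlated feature (zip code) lets a
# model reconstruct nearly the same biased signal.
import random

random.seed(1)
rows = []
for _ in range(10_000):
    group = random.randint(0, 1)                               # protected attribute
    zip_code = group if random.random() < 0.9 else 1 - group   # 90% correlated proxy
    label = group if random.random() < 0.8 else 1 - group      # biased outcome
    rows.append((group, zip_code, label))

def accuracy(predict):
    return sum(predict(g, z) == y for g, z, y in rows) / len(rows)

print("predict from protected attribute:", accuracy(lambda g, z: g))  # ~0.80
print("predict from zip code alone:     ", accuracy(lambda g, z: z))  # ~0.74
# Removing the protected column barely dents the bias: the proxy remains.
```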
Also, the way problems get framed in a computer lab can sometimes have no grounding in reality or in the highly nuanced social contexts where the algorithm is finally deployed.
Finally, bias, and fairness itself, is a fairly contentious concept. While most of us can agree on what is fair or just in a given scenario, it is extremely hard to replicate that judgment across countless scenarios, each with its own unique context, given our own limitations of knowledge, understanding, and perspective. Most of us lack the sensitivity to perpetually check our assumptions and recognize cognitive biases, whether at an individual, group, institutional, or regional level. A lot can get lost between different languages, cultures, and historical or social contexts. Now imagine doing all of that in mathematical terms.
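To see why "fairness in mathematical terms" is so slippery, consider this small sketch with made-up confusion counts, where two textbook fairness definitions, demographic parity and equal opportunity, disagree about the very same classifier:

```python
# Two textbook fairness definitions applied to the same made-up confusion
# counts for groups A and B: the classifier satisfies one and violates the
# other, so formalizing "fairness" forces a choice. All counts are invented.

groups = {
    #      (true_pos, false_pos, false_neg, true_neg)
    "A": (40, 10, 10, 40),   # 100 people, 50 of them truly qualified
    "B": (15, 35, 10, 40),   # 100 people, 25 of them truly qualified
}

for name, (tp, fp, fn, tn) in groups.items():
    selection_rate = (tp + fp) / (tp + fp + fn + tn)  # demographic parity compares this
    true_pos_rate = tp / (tp + fn)                    # equal opportunity compares this
    print(name, "selection rate:", selection_rate, "TPR:", true_pos_rate)

# Both groups are selected at the same 0.5 rate (demographic parity holds),
# yet qualified people are found 80% of the time in A and only 60% in B
# (equal opportunity fails). Neither metric alone settles what is "fair".
```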
AI algorithms now shape the flow of information that finds its way into people’s lives and how they perceive the world. Developers and companies have a responsibility to address these biases transparently. There’s room for healthy debates around the ethical frameworks that govern AI search technologies. We need to challenge the myth of AI neutrality and actively encourage broader dialogue on ethical AI so we can ensure transparency in how search algorithms are built and impact people’s lives.
The goal is to make search more transparent and equitable. We need AI systems that are designed with diverse data, accountable algorithms, and clear ethical guidelines. It’s about creating search systems that respect users' access to balanced and unbiased information, rather than reinforcing existing power structures or commercial interests. Bias mitigation may never be easy, but a structured approach and keeping user interests front and center can get us close.
Three Key Steps for AI Bias Mitigation in Search:
1. Design with diverse data: audit training corpora for representation gaps before models are built (see the sketch below).
2. Hold algorithms accountable: measure outcomes across groups and make how results are ranked open to scrutiny.
3. Establish clear ethical guidelines: agree on which fairness standards apply and keep user interests front and center.
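As one concrete starting point for the first of these steps, here is a minimal, hypothetical audit that flags over- and under-represented groups in a training set before any model is trained. The group names, shares, and tolerance are invented for illustration:

```python
# A minimal pre-training data audit: compare each group's share of the
# training set against a reference population share and flag gaps.
from collections import Counter

def audit_representation(samples, population_share, tolerance=0.05):
    counts = Counter(s["group"] for s in samples)
    total = len(samples)
    flags = []
    for group, expected in population_share.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            flags.append((group, round(observed, 3), expected))
    return flags  # an empty list means every group is within tolerance

samples = ([{"group": "A"}] * 700 + [{"group": "B"}] * 260
           + [{"group": "C"}] * 40)
population_share = {"A": 0.60, "B": 0.30, "C": 0.10}
print(audit_representation(samples, population_share))
# [('A', 0.7, 0.6), ('C', 0.04, 0.1)] -- A over-, C under-represented.
```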
Sujan Abraham is a senior software engineer specializing in AI-driven search technology and data infrastructure. With over 15 years of experience, Sujan has played critical roles in designing and building highly scalable systems that leverage AI models to improve search functionality and data ingestion processes. His work spans industries such as e-commerce, healthcare, and agriculture, where he has helped organizations transform their operations through AI integration.
Currently a senior engineer at Labelbox, Sujan focuses on developing scalable systems that enable businesses to efficiently handle and process large datasets. His expertise lies at the intersection of AI and search technology, where he is passionate about improving the accessibility and utility of data. Prior to Labelbox, Sujan held key engineering roles at Citrix Systems and Better.com, where he further honed his skills in creating high-performance systems for large-scale operations.
Sujan’s contributions have been instrumental in enhancing data-centric AI platforms, and he is actively involved in the AI community through various professional engagements. He brings a deep understanding of the role data quality plays in AI success, believing that future breakthroughs will come from smarter data collection and processing strategies.
Disclaimer: The author is completely responsible for the content of this article. The opinions expressed are his own and do not represent IEEE's position nor that of the Computer Society or its leadership.