#AI data governance
Explore tagged Tumblr posts
jcmarchi · 3 days ago
Text
Monetizing Research for AI Training: The Risks and Best Practices
New Post has been published on https://thedigitalinsider.com/monetizing-research-for-ai-training-the-risks-and-best-practices/
As the demand for generative AI grows, so does the hunger for high-quality data to train these systems. Scholarly publishers have started to monetize their research content to provide training data for large language models (LLMs). While this development creates a new revenue stream for publishers and empowers generative AI for scientific discovery, it raises a crucial question: are the datasets being sold trustworthy, and what are the implications of this practice for the scientific community and for generative AI models?
The Rise of Monetized Research Deals
Major academic publishers, including Wiley, Taylor & Francis, and others, have reported substantial revenues from licensing their content to tech companies developing generative AI models. For instance, Wiley revealed over $40 million in earnings from such deals this year alone. These agreements give AI companies access to diverse and expansive scientific datasets, presumably improving the quality of their AI tools.
The pitch from publishers is straightforward: licensing ensures better AI models, benefiting society while rewarding authors with royalties. The model serves both tech companies and publishers. However, the growing trend of monetizing scientific knowledge carries risks, especially when questionable research infiltrates these AI training datasets.
The Shadow of Bogus Research
The scholarly community is no stranger to fraudulent research. Studies suggest that many published findings are flawed, biased, or simply unreliable. A 2020 survey found that nearly half of researchers reported issues like selective data reporting or poorly designed field studies. In 2023, more than 10,000 papers were retracted due to falsified or unreliable results, a number that continues to climb annually. Experts believe this figure represents only the tip of the iceberg, with countless dubious studies circulating in scientific databases.
The crisis has been driven primarily by “paper mills,” shadow organizations that produce fabricated studies, often in response to academic pressures in regions like China, India, and Eastern Europe. An estimated 2% of journal submissions globally come from paper mills. These sham papers can resemble legitimate research but are riddled with fictitious data and baseless conclusions. Disturbingly, such papers slip through peer review and end up in respected journals, compromising the reliability of scientific insights. For instance, during the COVID-19 pandemic, flawed studies on ivermectin falsely suggested its efficacy as a treatment, sowing confusion and delaying effective public health responses. This example highlights the real-world harm that disseminating unreliable research can cause.
Consequences for AI Training and Trust
The implications are profound when LLMs train on databases containing fraudulent or low-quality research. AI models use patterns and relationships within their training data to generate outputs; if the input data is corrupted, the outputs may perpetuate those inaccuracies or even amplify them. The risk is particularly high in fields like medicine, where incorrect AI-generated insights could have life-threatening consequences.

Moreover, the issue threatens public trust in academia and in AI. As publishers continue to strike licensing agreements, they must address concerns about the quality of the data being sold. Failure to do so could harm the reputation of the scientific community and undermine AI’s potential societal benefits.
Ensuring Trustworthy Data for AI
Reducing the risk of flawed research disrupting AI training requires a joint effort from publishers, AI companies, developers, researchers, and the broader community.

Publishers must improve their peer-review processes to catch unreliable studies before they make it into training datasets. Offering better rewards for reviewers and setting higher standards can help, and an open review process is critical: it brings more transparency and accountability, helping to build trust in the research.

AI companies must be more careful about whom they work with when sourcing research for AI training. Choosing publishers and journals with a strong reputation for high-quality, well-reviewed research is key, and it is worth looking closely at a publisher’s track record, such as how often it retracts papers and how open it is about its review process. Being selective improves the data’s reliability and builds trust across the AI and research communities.
AI developers need to take responsibility for the data they use. This means working with experts, carefully checking research, and comparing results from multiple studies. AI tools themselves can also be designed to identify suspicious data and reduce the risks of questionable research spreading further.
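To make that concrete, here is a minimal sketch of one such safeguard: screening a candidate training corpus against a known-retractions list before ingestion. It is illustrative only; the file name, field names, and data are hypothetical, and a real pipeline would draw on a maintained source such as a retraction database export.

```python
import csv

def load_retracted_dois(path):
    # Hypothetical CSV export with a 'doi' column; adjust to the
    # schema of whatever retraction dataset is actually available.
    with open(path, newline="", encoding="utf-8") as f:
        return {row["doi"].strip().lower() for row in csv.DictReader(f)}

def filter_corpus(papers, retracted_dois):
    # Keep papers whose DOI is not on the retraction list; papers
    # without a DOI go to manual review instead of silently passing.
    kept, needs_review = [], []
    for paper in papers:
        doi = (paper.get("doi") or "").strip().lower()
        if not doi:
            needs_review.append(paper)
        elif doi not in retracted_dois:
            kept.append(paper)
    return kept, needs_review

# Toy demonstration with made-up records:
retracted = {"10.1234/fake.2023.001"}
corpus = [
    {"doi": "10.1234/fake.2023.001", "text": "..."},  # retracted: dropped
    {"doi": "10.5555/solid.2022.042", "text": "..."},  # kept
    {"doi": "", "text": "..."},                        # no DOI: manual review
]
kept, review = filter_corpus(corpus, retracted)
print(len(kept), len(review))  # 1 1
```

A DOI check like this catches only already-retracted work; it complements, rather than replaces, the expert review and cross-study comparison described above.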
Transparency is also essential. Publishers and AI companies should openly share details about how research is used and where royalties go. Tools like the Generative AI Licensing Agreement Tracker show promise but need broader adoption. Researchers should also have a say in how their work is used. Opt-in policies, like those from Cambridge University Press, offer authors control over their contributions. This builds trust, ensures fairness, and makes authors active participants in the process.
Moreover, open access to high-quality research should be encouraged to ensure inclusivity and fairness in AI development. Governments, non-profits, and industry players can fund open-access initiatives, reducing reliance on commercial publishers for critical training datasets. On top of that, the AI industry needs clear rules for sourcing data ethically. By focusing on reliable, well-reviewed research, we can build better AI tools, protect scientific integrity, and maintain the public’s trust in science and technology.
The Bottom Line
Monetizing research for AI training presents both opportunities and challenges. While licensing academic content allows for the development of more powerful AI models, it also raises concerns about the integrity and reliability of the data used. Flawed research, including that from “paper mills,” can corrupt AI training datasets, leading to inaccuracies that may undermine public trust and the potential benefits of AI. To ensure AI models are built on trustworthy data, publishers, AI companies, and developers must work together to improve peer review processes, increase transparency, and prioritize high-quality, well-vetted research. By doing so, we can safeguard the future of AI and uphold the integrity of the scientific community.
0 notes
michellesanches · 9 months ago
Text
Latest AI Regulatory Developments:
As artificial intelligence (AI) continues to transform industries, governments worldwide are responding with evolving regulatory frameworks. These regulatory advancements are shaping how businesses integrate and leverage AI technologies. Understanding these changes and preparing for them is crucial to remaining compliant and competitive.

Recent Developments in AI Regulation:

United Kingdom: The…
1 note · View note
allhealwesttexas · 19 days ago
Text
"wrapped look bad. wrapped is AI" oh my god dont you get it. AI is not one thing. it's not some computer guy they dreamed up. it's not one algorithm. everything is AI. nothing is AI. it's all computer generated content. all of it. it is this year. it was last year & the year before that.
9 notes · View notes
nando161mando · 3 months ago
Text
My oh my. From ABC News Australia: "Facebook admits to scraping every Australian adult user's public photos and posts to train AI, with no opt-out option."
A summary:
"The company does not offer Australians an opt out option like it does in the EU, because it has not been required to do so under privacy law."
https://www.abc.net.au/news/2024-09-11/facebook-scraping-photos-data-no-opt-out/104336170
BTW the ABC News website has links to Facebook, Instagram and X but no Fediverse profiles. So their posts are used to train #Llama
5 notes · View notes
innonurse · 3 months ago
Text
Mark your calendar for these health tech conferences in 2024-2025
- By InnoNurse Staff -
Interested in health technology-related events for fall 2024 and 2025? Fierce Healthcare has compiled a list of key conferences, both virtual and in-person, scheduled for the upcoming seasons.
Read more at Fierce Healthcare
///
Other recent news and insights
Lapsi transforms the stethoscope into a health tracking data platform (TechCrunch)
UK: The Department of Health and Social Care set to review clinical risk standards for digital health technologies (Digital Health)
AI-based cancer test determines if chemotherapy is needed (The Financial Express)
New tool enhances microscopic imaging by eliminating motion artifacts (UC Berkeley/Tech Xplore)
Researchers integrate a fast optical coherence tomography system into neurosurgical microscopes (Optica)
AI model achieves clinical-expert-level accuracy in complex medical scans (UCLA/Medical Xpress)
Bioinformatics reveals the hidden prevalence of repeat expansion disorders (Queen Mary University of London/Medical Xpress)
Ultrasound detects 96% of ovarian cancers in postmenopausal women (University of Birmingham)
AI ‘liquid biopsies’ using cell-free DNA and protein biomarkers could improve early ovarian cancer detection (Johns Hopkins Technology Ventures)
Mammograms show potential for detecting heart disease (UC San Diego/Medical Xpress)
IMRT and proton therapy provide similar quality of life and tumor control for prostate cancer patients (American Society for Radiation Oncology/Medical Xpress)
Machine learning enhances MRI video quality (Graz University of Technology/Medical Xpress)
Robotic surgery for colorectal cancer reduces pain and accelerates recovery (Beth Israel Deaconess Medical Center)
Global human brain mapping project releases its first data set (Allen Institute)
AI could speed up PCR tests, aiding faster DNA diagnostics and forensics (Flinders University/Medical Xpress)
AI-powered apps may detect depression through eye snapshots (Stevens Institute of Technology/Medical Xpress)
2 notes · View notes
vague-humanoid · 1 month ago
Text
At the California Institute of the Arts, it all started with a videoconference between the registrar’s office and a nonprofit.
One of the nonprofit’s representatives had enabled an AI note-taking tool from Read AI. At the end of the meeting, it emailed a summary to all attendees, said Allan Chen, the institute’s chief technology officer. They could have a copy of the notes, if they wanted — they just needed to create their own account.
Next thing Chen knew, Read AI’s bot had popped up in about a dozen of his meetings over a one-week span. It was in one-on-one check-ins. Project meetings. “Everything.”
The spread “was very aggressive,” recalled Chen, who also serves as vice president for institute technology. And it “took us by surprise.”
The scenario underscores a growing challenge for colleges: Tech adoption and experimentation among students, faculty, and staff — especially as it pertains to AI — are outpacing institutions’ governance of these technologies and may even violate their data-privacy and security policies.
That has been the case with note-taking tools from companies including Read AI, Otter.ai, and Fireflies.ai. They can integrate with platforms like Zoom, Google Meet, and Microsoft Teams to provide live transcriptions, meeting summaries, audio and video recordings, and other services.
Higher-ed interest in these products isn’t surprising. For those bogged down with virtual rendezvouses, a tool that can ingest long, winding conversations and spit out key takeaways and action items is alluring. These services can also aid people with disabilities, including those who are deaf.
But the tools can quickly propagate unchecked across a university. They can auto-join any virtual meetings on a user’s calendar — even if that person is not in attendance. And that’s a concern, administrators say, if it means third-party products that an institution hasn’t reviewed may be capturing and analyzing personal information, proprietary material, or confidential communications.
“What keeps me up at night is the ability for individual users to do things that are very powerful, but they don’t realize what they’re doing,” Chen said. “You may not realize you’re opening a can of worms.”
The Chronicle documented both individual and universitywide instances of this trend. At Tidewater Community College, in Virginia, Heather Brown, an instructional designer, unwittingly gave Otter.ai’s tool access to her calendar, and it joined a Faculty Senate meeting she didn’t end up attending. “One of our [associate vice presidents] reached out to inform me,” she wrote in a message. “I was mortified!”
19K notes · View notes
defensenow · 4 days ago
Text
[YouTube video embed]
1 note · View note
alyfoxxxen · 6 days ago
Text
AI’s hype and antitrust problem is coming under scrutiny | MIT Technology Review
0 notes
hitechnectartrends · 13 days ago
Text
How can organizations balance risk and reward when adopting Generative AI?
Organizations can balance risk and reward when adopting Generative AI (GenAI) by implementing a strategic approach that emphasizes due diligence, ethical frameworks, and robust governance. Here are key strategies to achieve this balance:
0 notes
jcmarchi · 11 hours ago
Text
Only 2.1% avoided generative AI in 2024: Find out why
New Post has been published on https://thedigitalinsider.com/only-2-1-avoided-generative-ai-in-2024-find-out-why/
This year, only 2.1% of respondents in our Generative AI Report said they don’t use generative AI tools, down from 11.8% last year. We wanted to know what was behind this choice and this drop.
This significant drop suggests a variety of important underlying factors, such as increased awareness and understanding, broader accessibility, proven effectiveness and ROI, peer influence and industry trends, the evolution of technology, and cultural and organizational shifts.
ChatGPT
OV
Matlab
Ansys
Gemini
LLama
Midjourney
Create our own internally – 33.4%
Lack of interest – 33.3% 
Irrelevance – 33.3%
There was an even distribution of why respondents said they don’t use generative AI.
The choice to develop AI tools internally could mean there’s a preference for customized solutions tailored to the specific business, driven by concerns over control or a desire to keep systems proprietary.

Respondents also indicated a lack of interest as a reason for not adopting external generative AI tools. This decision, potentially rooted in a perceived lack of clear advantage or a limited understanding of how these tools could benefit their specific operations, points to a possible gap in awareness.

Similarly, the perception that generative AI tools are irrelevant to operations suggests a disconnect between what AI technologies offer and what potential users need or understand.
All non-users of generative AI said they’d be willing to use the tools in the future, which offers important insight into how perceptions of the technology are evolving.

Generative AI tools are being broadly accepted and viewed positively, meaning they’ve proven their value and convinced even previously hesitant individuals and companies of their potential benefits.
There’s also a marked potential for widespread implementation of these tools across more varied sectors and tools as more non-users adopt the technology. We might see generative AI even more extensively integrated into business operations, potentially leading to a new wave of digital transformation.
Generative AI 2024: Key insights & emerging trends
Download the Generative AI 2024 Report for in-depth analysis on top tools, user benefits, and key challenges shaping the future of AI technology.
The respondents highlighted the tools below as ones they’d consider using:
Task-dependent
LLaMa3
Matlab
OV
Ansys
The general consensus about the use of generative AI 
We wanted to know the opinion of the respondents’ companies about generative AI. 
For – 33.3%
Neutral – 66.7%
The majority of respondents (66.7%) indicated that their companies hold a neutral opinion about generative AI, while the rest (33.3%) stated that their company has a favorable outlook.
The prevalence of this neutral view could suggest that many companies are still assessing the potential impacts and benefits of generative AI without yet committing fully to its adoption. 
As was the case last year, there’s an absence of opposition, implying an open-minded attitude toward generative AI and perhaps even its future adoption.
We wanted to know how much those who don’t use the technology trust it. Perhaps surprisingly, not everyone who said they’d use the technology in the future also said they trusted it.
Yes – 72.8%
No – 27.2%
The level of trust in generative AI tools (72.8%) remained the same as last year; as before, not everyone who said they’d use the technology in the future also said they trusted it.
Mostly positive – 33.3%
Mixed – 66.7%
The majority of respondents (66.7%) see the impact of the tools as mixed, suggesting a nuanced weighing of the technology’s benefits against its challenges. This view could stem from an awareness that, while driving innovation and efficiency, generative AI can pose risks around ethics and bias.
How do you envision the role of generative AI evolving in your industry?
All non-users of generative AI view its role as that of a supplementary tool, which could suggest that, while it has its uses, it’s not seen as vital for core business operations. This could point to an opportunity for developers to educate potential users and demonstrate AI’s broader benefits and capabilities.
Potential misuse of personal/generated data – 33.4%
Lack of transparency in data usage – 33.3%
Other – 33.3%
Potential misuse of personal/generated data and lack of transparency in data usage are equal concerns for non-users of the technology, pointing to a fear of personal data being mishandled by AI systems or third parties.

Transparency over how data is used, handled, and stored by AI systems requires strong data-governance policies and regulations, which AI providers must communicate clearly.
Those who answered ‘other’ didn’t specify.
Download the full report to see why and how end users and practitioners are using generative AI tools.
0 notes
garymdm · 14 days ago
Text
Data Management Trends for 2025
The data landscape is constantly evolving, driven by technological advancements and changing business needs. As we move into 2025, several key trends are shaping the future of data management.

1. AI-Powered Data Management

Automated Data Processes: AI-driven automation will streamline tasks like data cleaning, classification, and governance.

Enhanced Data Insights: AI-powered analytics will…
0 notes
nando161mando · 3 months ago
Text
Social networks in 2024 are giving the public every reason not to use them
5 notes · View notes
hello-there · 4 days ago
Text
Communities are a new way to connect with the people on Tumblr who care about the things you care about! Browse Communities to find the perfect one for your interests or create a new one and invite your friends and mutuals!
332 notes · View notes
rajaniesh · 2 months ago
Text
From Data to Decisions: Empowering Teams with Databricks AI/BI
🚀 Unlock the Power of Data with Databricks AI/BI! 🚀 Imagine a world where your entire team can access data insights in real-time, without needing to be data experts. Databricks AI/BI is making this possible with powerful features like conversational AI
In today’s business world, data is abundant—coming from sources like customer interactions, sales metrics, and supply chain information. Yet many organizations still struggle to transform this data into actionable insights. Teams often face siloed systems, complex analytics processes, and delays that hinder timely, data-driven decisions. Databricks AI/BI was designed with these challenges in…
0 notes
omegaphilosophia · 2 months ago
Text
Integrating AI to Improve Politics: Opportunities and Challenges
Artificial Intelligence (AI) has the potential to revolutionize various aspects of politics, from enhancing decision-making processes to increasing transparency and engagement. Here are several ways AI can be integrated into politics to bring about improvements:
1. Data Analysis and Decision-Making
AI can analyze large volumes of data to identify trends, patterns, and correlations that human analysts might miss. By leveraging machine learning algorithms, policymakers can make more informed decisions based on empirical evidence and predictive models. For instance, AI can be used to predict the outcomes of different policy options, helping governments to choose the most effective strategies.
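As a toy illustration of that idea, the sketch below fits a simple regression on invented historical policy data and compares two hypothetical options. Every feature name and number here is made up for illustration; real policy modeling would demand far more careful data, features, and validation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented history: each row is a past program
# (budget in $M, rollout months, % of population covered);
# y is the observed outcome (e.g. change in employment rate).
X = np.array([
    [10.0, 6, 20],
    [25.0, 12, 45],
    [40.0, 18, 70],
    [55.0, 24, 90],
])
y = np.array([0.2, 0.5, 0.9, 1.1])

model = LinearRegression().fit(X, y)

# Compare two hypothetical policy options before committing to one.
options = np.array([
    [30.0, 12, 50],  # option A: shorter rollout, narrower coverage
    [30.0, 24, 80],  # option B: same budget, longer and broader
])
print(model.predict(options))
```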
2. Enhancing Transparency and Accountability
AI can help increase government transparency and accountability by monitoring and analyzing public spending, detecting anomalies, and identifying potential cases of corruption or misuse of funds. By providing real-time insights into governmental activities, AI can enable citizens to hold their representatives accountable and ensure that public resources are used efficiently.
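One common technique behind this kind of monitoring is unsupervised anomaly detection. The sketch below is a minimal example using scikit-learn's IsolationForest on synthetic payment records; all values and features are invented, and a real audit system would use far richer data and keep human auditors in the loop.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Invented payments: (amount, days from contract award to payment,
# number of bidders). All values are synthetic for illustration.
rng = np.random.default_rng(0)
normal = rng.normal(loc=[50_000, 90, 4], scale=[10_000, 20, 1], size=(200, 3))
suspicious = np.array([[900_000.0, 2.0, 1.0]])  # huge, rushed, single bidder
payments = np.vstack([normal, suspicious])

detector = IsolationForest(contamination=0.01, random_state=0).fit(payments)
flags = detector.predict(payments)  # -1 = anomaly, 1 = normal

print(payments[flags == -1])  # flagged rows are routed to human review
```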
3. Improving Public Services
AI can streamline public services by automating routine tasks, optimizing resource allocation, and improving service delivery. For example, AI-powered chatbots can provide citizens with instant responses to their inquiries, while AI-driven systems can optimize healthcare, education, and transportation services to better meet the needs of the population.
4. Enhancing Public Engagement
AI can facilitate greater public engagement in the political process by analyzing social media and other online platforms to gauge public opinion, identify key issues, and predict voter behavior. This information can help politicians and policymakers to better understand the concerns of their constituents and to develop policies that reflect the needs and preferences of the public.
5. Combating Misinformation
AI can play a crucial role in identifying and combating misinformation and fake news. By using natural language processing (NLP) algorithms, AI can detect misleading content, verify facts, and provide users with accurate information. This can help to create a more informed electorate and reduce the impact of misinformation on political discourse.
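As a hedged sketch of what an NLP-based screening step might look like, here is a tiny text classifier built with scikit-learn. The four training examples and labels are invented; a production fact-checking system would need large labeled datasets, careful evaluation, and human reviewers.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set: 0 = credible-looking, 1 = misleading-looking.
texts = [
    "Officials confirm the report after independent review",
    "Peer-reviewed study published in a major journal",
    "Secret cure THEY don't want you to know about!!!",
    "Shocking miracle remedy banned by doctors, share now",
]
labels = [0, 0, 1, 1]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

# Score a new claim: higher probability = more likely misleading.
print(clf.predict_proba(["Miracle cure doctors won't tell you about"])[:, 1])
```

A classifier like this flags only stylistic red flags; genuine fact verification still requires checking claims against trusted sources.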
6. Enhancing Security and Cyber Defense
AI can bolster national security and cyber defense by detecting and responding to cyber threats in real-time. Machine learning algorithms can identify unusual patterns of behavior that may indicate a cyber attack and can help to protect critical infrastructure and sensitive data from being compromised.
7. Facilitating International Diplomacy
AI can assist in international diplomacy by analyzing geopolitical trends, predicting potential conflicts, and identifying opportunities for cooperation. AI can also support negotiations by providing real-time translation services, summarizing key points, and identifying areas of agreement and disagreement.
Challenges and Ethical Considerations
While AI has the potential to improve politics in numerous ways, there are also significant challenges and ethical considerations to address:
Bias and Fairness: AI systems can perpetuate or even exacerbate existing biases if they are trained on biased data. Ensuring that AI is fair and unbiased is crucial to its successful integration into politics.
Privacy: The use of AI in politics raises important questions about privacy and data security. Safeguarding citizens' personal information and ensuring that data is used responsibly is essential.
Accountability: Determining accountability for decisions made or influenced by AI can be complex. Establishing clear guidelines and regulations for the use of AI in politics is necessary to ensure transparency and accountability.
Public Trust: Building and maintaining public trust in AI-driven political processes is critical. Governments and political entities must be transparent about how AI is used and must demonstrate its benefits to gain public support.
The integration of AI into politics offers significant opportunities for enhancing decision-making, transparency, public engagement, and security. However, it also presents challenges that must be carefully managed to ensure that AI is used ethically and responsibly. By addressing these challenges, AI can become a powerful tool for improving political processes and outcomes, ultimately leading to better governance and a more informed and engaged citizenry.
0 notes
orcelito · 2 months ago
Text
I've got my advising appointment for next semester tomorrow, so I got to checking it all out. I will need to take Three more classes, out of the following...
Two of them have to be from the list of packaged app software solution, software development methods, and advanced systems design & integration. First two are prioritized, tho the third is there if either of the first aren't. Tho I'm a little confused bc none of these are listed in the class offerings for next semester??? But theyre requirements???? I'm gonna be asking my advisor about that.
And Then I have a selective. Just one. But I made a list of a bunch that I Could take & potentially would be useful in my career, ordered in preference priority:
Six sigma data quality
Enterprise data management
Policy, regulation, and globalization in information technology
Front end web coding
Applied machine learning
UNIX administration
Advanced systems development methodologies
Quality engineering in IT
Research methodology & design
Network engineering fundamentals
WHICH IS TO SAY...
A lot of tech possibilities lol. Last two are down there only bc they're restricted to another sub-major until open registration, so I probably won't get in. But I included them just in case :p
In general, out of my last semester I'd like to get more knowledge about coding and enterprise IT shit. Whatever could be most useful for me in an IT career.
1 note · View note
prolificsinsightsblog · 2 months ago
Text
0 notes