#Speech Recognition
#IBM voice recognition#old computers#old tech#old commercials#1987#80s#80s commercials#speech recognition#on the computer screen#smart tech#VHS#retrowave#typeface#gif#my gifs#where we all are#in some form
Quick update on the State of the Nation & Very Important Technological Advancement:
The speech-to-text tool on my Android phone recognizes the word "destiel".
It's a little janky and apparently 50% likely to spontaneously delete all the other words in the sentence and just leave "destiel" for some reason.
But isn't that what Supernatural is really about? Aren't we really all just here in this fandom to forget all the words except for Destiel??
.... Now if I could JUST get speech-to-text to REMEMBER LITERALLY ANY ETHNIC NAME, THAT'D BE GREAT.
I know for a fact that it is possible and even relatively easy to teach speech recognition software to register new words because I used to work testing and calibrating Alexa apps. I KNOW HUMANITY HAS THE TECHNOLOGY, DAMMIT! - But I haven't been able to find a speech-to-text app that allows me to do this. Anyone else have more success than me?
#original#spn#destiel#speech-to-text#speech recognition#speech recognition software#speech to text#carpal tunnel#one of the main characters in my graphic novel is named Kuruk which is a rare Plains Indian Pawnee name and lemme tell ya#speech to text will not accept this name no matter what i do#some notable guesses it's made: cool rock. iraq. curl Rick. korok. correct. Clorox. cur lock. kurok - oh that one is actually so close!!!#typing accessibility#accessibility#accessibility software#disabled writer
What is OpenAI Whisper Speech Recognition?
Discover how OpenAI Whisper speech recognition is transforming audio processing. From transcriptions to translations, explore its features and applications. Perfect for businesses, educators, and creators looking to enhance productivity with #AI
OpenAI Whisper speech recognition is a powerful tool designed to transform spoken words into text. Built by OpenAI, Whisper uses advanced AI technology to handle tasks like transcription and translation. It stands out for its ability to process audio accurately, even in noisy environments or with heavy accents. This makes it one of the most reliable speech recognition tools available…
The Best Free Voice Conversational AIs
Introduction to Voice Conversational AIs In recent years, voice conversational AIs have gained prominence in many areas, from personal assistants to enterprise chatbots to home automation systems. These systems use advanced speech recognition, natural language processing (NLP), and speech synthesis (Text-to-Speech) technologies to enable a…
#AI chatbots#AI for customer service#AI-driven chatbots#Conversational AI#Generative AI#ia#Inteligencia Artificial#Interactive voice response (IVR)#Machine learning (ML) for AI#Natural language processing (NLP)#Speech recognition#Speech-to-text API#Text-to-speech (TTS)#Virtual assistants#Voice assistants#Voice interfaces#Voice recognition
How To Create A Speech Recognition System Using HTML, CSS And JavaScript - Sohojware
The way we interact with technology is constantly evolving. Gone are the days of clunky keyboards and endless typing. Speech recognition systems, a form of Artificial Intelligence (AI), have emerged as a powerful tool, allowing us to interact with our devices through the power of our voice. This technology has many applications, from creating voice-controlled assistants to transcribing audio recordings.
In this article, brought to you by Sohojware, a leading US-based software development company, we'll delve into the exciting world of speech recognition systems and guide you through building a basic one using HTML, CSS, and JavaScript.
What is a Speech Recognition System?
A speech recognition system, also known as Automatic Speech Recognition (ASR), is a technology that converts spoken language into text. Imagine being able to dictate emails, search the web, or control your smart home devices using just your voice. Speech recognition systems are making this a reality, transforming the way we interact with computers and the digital world.
Benefits of Speech Recognition Systems
Speech recognition systems offer a multitude of advantages, including:
Increased Accessibility: Speech recognition systems empower individuals with disabilities or those who struggle with typing to interact with technology more easily.
Enhanced Productivity: Speech recognition systems can significantly boost productivity by allowing users to dictate tasks and commands instead of manually typing.
Improved Accuracy: Speech recognition systems can potentially reduce errors by eliminating the need for manual data entry.
Hands-free Interaction: Speech recognition systems enable hands-free control of devices, allowing for multitasking and greater convenience.
Building a Basic Speech Recognition System with HTML, CSS, and JavaScript
Sohojware is dedicated to empowering developers and enthusiasts of all levels. Here's a step-by-step guide to creating a simple speech recognition system using these fundamental web technologies:
1. HTML Structure
First, we'll establish the basic structure of our web page using HTML. Let's create an index.html file with the following code:
This code creates a basic HTML document with a title, a link to a CSS stylesheet (style.css), and a container (div) for our speech recognition system. Inside the container, we have a button to initiate recognition and a div to display the recognized text (transcript). Finally, we include a script tag that references an external JavaScript file (script.js) containing the core functionality.
2. CSS Styling (style.css)
Now, let's add some visual appeal to our application using CSS:
This code simply styles the elements within our speech-container div, providing a centered layout, margins, and basic button and text styling. You can customize these styles further to match your preferences.
3. JavaScript Functionality (script.js)
The magic happens in the JavaScript code. Here's what goes inside the script.js file:
This code:
Retrieves elements: Selects the start button and transcript element from the HTML document.
Adds event listener: Attaches a click event listener to the start button.
Creates recognition object: Initializes a webkitSpeechRecognition object.
Sets language: Specifies the language for recognition (in this case, English-US).
Handles results: Defines a callback function for the onresult event, which fires when the recognition engine returns a result. The recognized text is extracted and displayed in the transcript element.
Handles errors: Defines a callback function for the onerror event, which is triggered if an error occurs during recognition. The error message is logged to the console.
Starts recognition: Begins the speech recognition process by calling the start() method on the recognition object.
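The script.js listing itself is not reproduced here, so the following is only a minimal sketch of what the steps above describe, not the article's original code. The element IDs `start-btn` and `transcript`, and the helper names `extractTranscript` and `setupRecognition`, are assumptions made for illustration. The recognition constructor is passed in as a parameter so the logic can also be exercised outside a browser.

```javascript
// Sketch of script.js based on the steps listed above.
// Assumed (hypothetical) names: element IDs "start-btn" and "transcript",
// helper functions extractTranscript and setupRecognition.

// Join the top alternative of every result into one string.
function extractTranscript(event) {
  let text = '';
  for (let i = 0; i < event.results.length; i++) {
    text += event.results[i][0].transcript;
  }
  return text;
}

// Attach a click listener that creates a recognizer, configures it, and starts it.
function setupRecognition(RecognitionCtor, startButton, transcriptEl) {
  startButton.addEventListener('click', () => {
    const recognition = new RecognitionCtor();
    recognition.lang = 'en-US'; // US English

    // Show the recognized text in the transcript element.
    recognition.onresult = (event) => {
      transcriptEl.textContent = extractTranscript(event);
    };

    // Log recognition errors to the console.
    recognition.onerror = (event) => {
      console.error('Speech recognition error:', event.error);
    };

    recognition.start();
  });
}

// Browser entry point: only runs where a window and document exist.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  const Ctor = window.SpeechRecognition || window.webkitSpeechRecognition;
  const startButton = document.getElementById('start-btn');
  const transcriptEl = document.getElementById('transcript');
  if (Ctor && startButton && transcriptEl) {
    setupRecognition(Ctor, startButton, transcriptEl);
  }
}
```

Creating a new recognizer on each click keeps the sketch short; a longer-lived page would probably reuse one instance and disable the button while recognition is running.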
Additional Considerations
Browser Compatibility: webkitSpeechRecognition is not available in every browser. It is supported mainly in Chromium-based browsers and Safari, so check for the API before using it and provide an alternative (such as a plain text input) where it is missing.
Error Handling: Implement more robust error handling to provide informative feedback to the user in case of recognition errors.
Accuracy: Experiment with different language models and settings to improve recognition accuracy for specific use cases.
Privacy: Be mindful of privacy concerns when handling speech data, especially in sensitive contexts. Consider using secure and privacy-preserving technologies.
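The compatibility and error-handling points can be sketched roughly as follows. This is an illustration under assumptions, not part of the original tutorial: the function names `getRecognitionCtor` and `describeRecognitionError` are hypothetical, and the message wording is made up. The error codes themselves (`no-speech`, `audio-capture`, `not-allowed`, `network`) are values the Web Speech API reports on the error event.

```javascript
// Sketch: feature detection plus friendlier error messages.
// Function names and message text are illustrative assumptions.

// Return the recognition constructor if the environment provides one, else null.
function getRecognitionCtor(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Map a few Web Speech API error codes to messages a user can act on.
const ERROR_MESSAGES = new Map([
  ['no-speech', 'No speech was detected. Try speaking again.'],
  ['audio-capture', 'No microphone was found.'],
  ['not-allowed', 'Microphone permission was denied.'],
  ['network', 'A network error interrupted recognition.'],
]);

// Fall back to a generic message for codes not listed above.
function describeRecognitionError(code) {
  return ERROR_MESSAGES.get(code) || `Recognition failed (${code}).`;
}
```

In a page, `getRecognitionCtor(window)` returning null would be the signal to hide the microphone button and fall back to typing, and the onerror handler could show `describeRecognitionError(event.error)` to the user instead of only logging it.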
Conclusion
By following these steps and leveraging the power of HTML, CSS, and JavaScript, you can create a functional speech recognition system that enhances user interaction and opens up new possibilities for your web applications. Sohojware, a leading US-based software development company, is committed to providing innovative solutions and empowering developers like you to build cutting-edge applications.
FAQs
How can I improve the accuracy of my speech recognition system?
Experiment with different language models and settings.
Consider using a cloud-based speech recognition service for higher accuracy.
Provide clear and concise prompts to guide the user's speech.
Can I use speech recognition to control other elements on my web page?
Absolutely! You can use JavaScript to trigger events or manipulate elements based on the recognized speech.
How can I ensure privacy when using speech recognition?
Consider using secure and privacy-preserving techniques to handle speech data.
Inform users about your privacy practices and obtain their consent.
What are some common use cases for speech recognition systems?
Voice-controlled assistants
Transcription of audio recordings
Accessibility features for individuals with disabilities
Hands-free control of devices
Can Sohojware assist me in developing a more advanced speech recognition system?
Yes, Sohojware offers custom software development services to help you create sophisticated speech recognition systems tailored to your specific needs.
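As a rough illustration of the FAQ answer about controlling page elements with speech, recognized text can be matched against a small table of commands. The function names `normalize` and `createCommandRouter`, and the example phrases, are assumptions made for this sketch rather than anything from the original article.

```javascript
// Sketch: route recognized phrases to page actions.
// Names and example phrases are hypothetical.

// Lowercase, strip punctuation, and collapse whitespace so that
// "Dark mode." and "dark  MODE" compare equal. ASCII-only for simplicity.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// Build a router from an object of { phrase: action } pairs.
// The returned function runs the matching action and reports whether one matched.
function createCommandRouter(commands) {
  const table = new Map();
  for (const [phrase, action] of Object.entries(commands)) {
    table.set(normalize(phrase), action);
  }
  return function route(transcript) {
    const action = table.get(normalize(transcript));
    if (!action) return false;
    action();
    return true;
  };
}
```

Inside an onresult handler, the router could be called with `event.results[0][0].transcript`. Exact matching is the simplest choice; real speech input varies, so a production version would likely need looser matching.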
Go Beyond Basic Chatbot: Explore Advanced NLP Techniques
In our day-to-day lives, most of us have used a chatbot, maybe without even knowing it. Advanced NLP techniques help these chatbots understand human queries and provide a solution. But have you ever wondered what a chatbot is, how it works, and what its functions are? Let’s find out. A chatbot is a computer program that operates through the cloud (at the backend),…
#Chatbot#Named Entity Recognition#Natural Language Generation#Natural Language Processing#Natural Language Understanding#NLP#NLP Techniques#Speech Recognition
Digital Content Accessibility
Discover ADA Site Compliance's solutions for digital content accessibility, ensuring inclusivity online!
#AI and web accessibility#ChatGPT-3#GPT-4#GPT-5#artificial intelligence#AI influences web accessibility#AI-powered tools#accessible technology#tools and solutions#machine learning#natural language processing#screen readers accessibility#voice recognition#speech recognition#image recognition#digital accessibility#alt text#advanced web accessibility#accessibility compliance#accessible websites#accessibility standards#website and digital content accessibility#digital content accessibility#free accessibility scan#ada compliance tools#ada compliance analysis#website accessibility solutions#ADA site compliance#ADASiteCompliance#adasitecompliance.com
TOP 10 COMPANIES IN SPEECH-TO-TEXT API MARKET
The Speech-to-text API Market is projected to reach $10 billion by 2030, growing at a CAGR of 17.3% from 2023 to 2030. This market's expansion is fueled by the widespread use of voice-enabled devices, increasing applications of voice and speech technologies for transcription, technological advancements, and the rising adoption of connected devices. However, the market's growth is restrained by the lack of accuracy in recognizing regional accents and dialects in speech-to-text API solutions.
Innovations aimed at enhancing speech-to-text solutions for specially-abled individuals and developing API solutions for rare and local languages are expected to create growth opportunities in this market. Nonetheless, data security and privacy concerns pose significant challenges. Additionally, the increasing demand for voice authentication in mobile banking applications is a prominent trend in the speech-to-text API market.
Top 10 Companies in the Speech-to-text API Market
Google LLC
Founded in 1998 and headquartered in California, U.S., Google is a global leader in search engine technology, online advertising, cloud computing, and more. Google’s Speech-to-Text is a cloud-based transcription tool that leverages AI to provide real-time transcription in over 80 languages from both live and pre-recorded audio.
Microsoft Corporation
Established in 1975 and headquartered in Washington, U.S., Microsoft Corporation offers a range of technology services, including cloud computing and AI-driven solutions. Microsoft’s speech-to-text services enable accurate transcription across multiple languages, supporting applications like customer self-service and speech analytics.
Amazon Web Services, Inc.
Founded in 2006 and headquartered in Washington, U.S., Amazon Web Services (AWS) provides scalable cloud computing platforms. AWS’s speech-to-text software supports real-time transcription and translation, enhancing various business applications with its robust infrastructure.
IBM Corporation
Founded in 1911 and headquartered in New York, U.S., IBM Corporation focuses on digital transformation and data security. IBM’s speech-to-text service, part of its Watson Assistant, offers multilingual transcription capabilities for diverse use cases, including customer service and speech analytics.
Verint Systems Inc.
Established in 1994 and headquartered in New York, U.S., Verint Systems specializes in customer engagement management. Verint’s speech transcription solutions provide accurate data via an API, supporting call recording and speech analytics within their contact center solutions.
Download Sample Report Here @ https://www.meticulousresearch.com/download-sample-report/cp_id=5473
Rev.com, Inc.
Founded in 2010 and headquartered in Texas, U.S., Rev.com offers transcription, closed captioning, and subtitling services. Rev AI’s Speech-to-Text API delivers high-accuracy transcription services, enhancing accessibility and audience reach for various brands.
Twilio Inc.
Founded in 2008 and headquartered in California, U.S., Twilio provides communication APIs for voice, text, chat, and video. Twilio’s speech recognition solutions facilitate real-time transcription and intent analysis during voice calls, supporting comprehensive customer engagement.
Baidu, Inc.
Founded in 2000 and headquartered in Beijing, China, Baidu is a leading AI company offering a comprehensive AI stack. Baidu’s speech recognition capabilities are part of its diverse product portfolio, supporting applications across natural language processing and augmented reality.
Speechmatics
Founded in 1980 and headquartered in Cambridge, U.K., Speechmatics is a leader in deep learning and speech recognition. Their speech-to-text API delivers highly accurate transcription by training on vast amounts of data, minimizing AI bias and recognition errors.
VoiceCloud
Founded in 2007 and headquartered in California, U.S., VoiceCloud offers cloud-based voice-to-text transcription services. Their API provides high-quality transcription for applications such as voicemail, voice notes, and call recordings, supporting services in English and Spanish across 15 countries.
Top 10 companies: https://meticulousblog.org/top-10-companies-in-speech-to-text-api-market/
Delhi Hearing Aid & Speech Therapy Center is a premier destination for comprehensive hearing and speech services in Delhi. Our Speech Therapy Center in Delhi offers specialized treatments to enhance communication skills and address speech disorders effectively. With a dedicated Hearing Aid shop, we provide a wide selection of advanced hearing aids and reliable Hearing aid repair services to ensure optimal hearing solutions. At our Hearing And Speech Clinic, we conduct thorough assessments, including Pure Tone Audiometry Test and Free Field Audiometry Test, to accurately diagnose hearing issues. Our experienced speech-language therapists offer personalized therapy sessions to improve speech and language abilities. Trust us for expert care and tailored solutions for all your hearing and speech needs in Delhi.
#speech development#speech#hearing voices#hearing test#speech disorder#speech recognition#hearing#hearing solution#hearing support supplement#hearing protection
#AI voice technology#Content creation#Natural language processing#Machine learning#Voice assistants#Virtual assistants#Speech recognition#Audio content#Personalization#Automation#Creative industries#Future of work#Digital transformation#User experience#Innovation.
#Web accessibility#Accessibility Initiative#WCAG#Section 508#Screen Readers#Disabilities#Universal Design#Web Accessibility Initiative#Color Contrast#Assistive Technologies#Screen Magnifiers#Speech Recognition#Low vision#Designer Accessibility#Accessibility Standards#AELData#Accessibility Audit
https://www.workbyspeech.com/
#speech recognition#voice recognition#continuous speech#speech to text analysis#customizable macros#macro recording
#Voice Translator#Language Translation#Real-time Translation#Multilingual Communication#Speech-to-Text#Text-to-Speech#Language Converter#Communication Tool#Travel Companion#Language Learning#Multilingual Support#International Communication#Translate Voice#Speech Recognition#Language Interpreter#Conversation Translator#Travel Language App#Language Exchange#Multilingual Dictionary#Instant Translation#Cross-language Communication#Voice Recognition#Translator App#Foreign Language Learning#Speech Translation#Language Converter App#Interpreter Tool#Multilingual Conversation#Language Services#Global Communication
Meta Releases SeamlessM4T Translation AI for Text and Speech
Meta took a step towards a universal language translator on Tuesday with the release of its new Seamless M4T AI model, which the company says can quickly and efficiently understand language from speech or text in up to 100 languages and generate translation in either mode of communication. Multiple tech companies have released similar advanced AI translation models in recent months. In a blog…
#Applications of artificial intelligence#Computational linguistics#Creative Commons#Gizmodo#Human Interest#Internet#Large language model#Machine translation#Massively Multilingual#Massively Multilingual Speech system#META#Meta AI#Multilingualism#Paco#Paco Guzmán#Speech recognition#Speech synthesis#Technology#Translation
Speech to Text App: The Emergence of SpeechFlow
In today’s fast-paced world, digital tools have become the backbone of our daily operations. At the forefront of these technological marvels is the speech to text app. Among its noteworthy players, SpeechFlow emerges as a titan, promising a blend of accuracy, efficiency, and convenience. With the spoken word becoming a dominant form of communication, the demand for precise transcription tools has never been higher. SpeechFlow, in particular, captures the essence of this demand.
The Inception and Rise of SpeechFlow
Venturing back, the idea behind SpeechFlow was simple: transform the spoken word into text flawlessly. The creators envisioned an app that wasn’t just a tool but an ally for professionals across sectors. Over time, with relentless innovation and upgrades, SpeechFlow ascended the ranks. It wasn't merely a speech to text app; it became a benchmark.
A unique aspect of SpeechFlow is its dedication to continuous improvement. Feedback loops with users, rigorous testing, and staying attuned to the latest AI advancements have kept it ahead of the curve. It's not just the tech, though; it's the human touch. SpeechFlow's team understood that while technology drives the product, human needs shape it. This balance between tech prowess and user understanding fueled its rise.
SpeechFlow vs. The Rest: What Makes It Stand Out?
The digital marketplace is rife with speech to text apps, but SpeechFlow carves a niche. Here's why:
Evolutionary Learning: Unlike static apps, SpeechFlow learns as it transcribes. Each task refines its subsequent performance, making it more accurate over time.
Collaborative Features: Beyond transcription, it allows multiple users to collaborate, comment, and share transcriptions, making teamwork seamless.
Integration Capabilities: Recognizing the diverse software ecosystems in businesses, SpeechFlow can integrate with a plethora of tools, from CRMs to content management systems.
Support & Community: Their robust support system, combined with an active community forum, means users never feel left in the dark.
Applications and Success Stories
From lecture halls in universities to board rooms in Fortune 500 companies, SpeechFlow's presence resonates.
Education: Professors and lecturers utilize it to transcribe lectures, providing students with accurate notes. This aids especially in online learning, where face-to-face interaction is minimal.
Medical Field: Doctors, amidst their busy schedules, no longer need to type out diagnoses or patient histories. A simple dictation, and SpeechFlow ensures it's accurately transcribed and stored.
Media and Journalism: Field reporters, podcasters, and interviewers find immense value in SpeechFlow. Capturing every spoken detail is vital, and the app ensures it's done meticulously.
A notable success story is that of a leading news agency that reduced its transcription time by 60% upon switching to SpeechFlow. The ripple effect was evident – quicker news cycles, timely updates, and a significant boost in audience engagement.
The Path Ahead: Predictions and Enhancements
SpeechFlow, in its current form, is powerful. But the vision is grander. The roadmap ahead includes:
Real-time Translations: Imagine a tool that doesn’t just transcribe but translates simultaneously. For global businesses, this could be groundbreaking.
Emotion Mapping: Detecting the tone and sentiment behind words to provide context to transcriptions.
Voice Commands: Integrating voice commands for a hands-free experience, enhancing user convenience manifold.
SpeechFlow epitomizes the fusion of innovation and utility. In the vast ocean of speech to text apps, it sails distinct, guided by the twin stars of user-centricity and technological prowess. As businesses, institutions, and individuals seek efficiency in communication, tools like SpeechFlow aren't just options; they're necessities.
If you're on the fence about adopting a speech to text app, consider this: in a world where time is currency, can you afford manual transcriptions? Make the shift, embrace SpeechFlow, and let technology amplify your words.
Bossware is unfair (in the legal sense, too)
You can get into a lot of trouble by assuming that rich people know what they're doing. For example, you might assume that ad-tech works – bypassing people's critical faculties, reaching inside their minds and brainwashing them with Big Data insights, because if that's not what's happening, then why would rich people pour billions into those ads?
https://pluralistic.net/2020/12/06/surveillance-tulip-bulbs/#adtech-bubble
You might assume that private equity looters make their investors rich, because otherwise, why would rich people hand over trillions for them to play with?
https://thenextrecession.wordpress.com/2024/11/19/private-equity-vampire-capital/
The truth is, rich people are suckers like the rest of us. If anything, succeeding once or twice makes you an even bigger mark, with a sense of your own infallibility that inflates to fill the bubble your yes-men seal you inside of.
Rich people fall for scams just like you and me. Anyone can be a mark. I was:
https://pluralistic.net/2024/02/05/cyber-dunning-kruger/#swiss-cheese-security
But though rich people can fall for scams the same way you and I do, the way those scams play out is very different when the marks are wealthy. As Keynes had it, "The market can remain irrational longer than you can remain solvent." When the marks are rich (or worse, super-rich), they can be played for much longer before they go bust, creating the appearance of solidity.
Noted Keynesian John Kenneth Galbraith had his own thoughts on this. Galbraith coined the term "bezzle" to describe "the magic interval when a confidence trickster knows he has the money he has appropriated but the victim does not yet understand that he has lost it." In that magic interval, everyone feels better off: the mark thinks he's up, and the con artist knows he's up.
Rich marks have looong bezzles. Empirically incorrect ideas grounded in the most outrageous superstition and junk science can take over whole sections of your life, simply because a rich person – or rich people – are convinced that they're good for you.
Take "scientific management." In the early 20th century, the con artist Frederick Taylor convinced rich industrialists that he could increase their workers' productivity through a kind of caliper-and-stopwatch driven choreography:
https://pluralistic.net/2022/08/21/great-taylors-ghost/#solidarity-or-bust
Taylor and his army of labcoated sadists perched at the elbows of factory workers (whom Taylor referred to as "stupid," "mentally sluggish," and as "an ox") and scripted their motions to a fare-thee-well, transforming their work into a kind of kabuki of obedience. They weren't more efficient, but they looked smart, like obedient robots, and this made their bosses happy. The bosses shelled out fortunes for Taylor's services, even though the workers who followed his prescriptions were less efficient and generated fewer profits. Bosses were so dazzled by the spectacle of a factory floor of crisply moving people interfacing with crisply working machines that they failed to understand that they were losing money on the whole business.
To the extent they noticed that their revenues were declining after implementing Taylorism, they assumed that this was because they needed more scientific management. Taylor had a sweet con: the worse his advice performed, the more reasons there were to pay him for more advice.
Taylorism is a perfect con to run on the wealthy and powerful. It feeds into their prejudice and mistrust of their workers, and into their misplaced confidence in their own ability to understand their workers' jobs better than their workers do. There's always a long dollar to be made playing the "scientific management" con.
Today, there's an app for that. "Bossware" is a class of technology that monitors and disciplines workers, and it was supercharged by the pandemic and the rise of work-from-home. Combine bossware with work-from-home and your boss gets to control your life even when in your own place – "work from home" becomes "live at work":
https://pluralistic.net/2021/02/24/gwb-rumsfeld-monsters/#bossware
Gig workers are at the white-hot center of bossware. Gig work promises "be your own boss," but bossware puts a Taylorist caliper wielder into your phone, monitoring and disciplining you as you drive your own car around delivering parcels or picking up passengers.
In automation terms, a worker hitched to an app this way is a "reverse centaur." Automation theorists call a human augmented by a machine a "centaur" – a human head supported by a machine's tireless and strong body. A "reverse centaur" is a machine augmented by a human – like the Amazon delivery driver whose app goads them to make inhuman delivery quotas while punishing them for looking in the "wrong" direction or even singing along with the radio:
https://pluralistic.net/2024/08/02/despotism-on-demand/#virtual-whips
Bossware pre-dates the current AI bubble, but AI mania has supercharged it. AI pumpers insist that AI can do things it positively cannot do – rolling out an "autonomous robot" that turns out to be a guy in a robot suit, say – and rich people are groomed to buy the services of "AI-powered" bossware:
https://pluralistic.net/2024/01/29/pay-no-attention/#to-the-little-man-behind-the-curtain
For an AI scammer like Elon Musk or Sam Altman, the fact that an AI can't do your job is irrelevant. From a business perspective, the only thing that matters is whether a salesperson can convince your boss that an AI can do your job – whether or not that's true:
https://pluralistic.net/2024/07/25/accountability-sinks/#work-harder-not-smarter
The fact that AI can't do your job, but that your boss can be convinced to fire you and replace you with the AI that can't do your job, is the central fact of the 21st century labor market. AI has created a world of "algorithmic management" where humans are demoted to reverse centaurs, monitored and bossed about by an app.
The techbro's overwhelming conceit is that nothing is a crime, so long as you do it with an app. Just as fintech is designed to be a bank that's exempt from banking regulations, the gig economy is meant to be a workplace that's exempt from labor law. But this wheeze is transparent, and easily pierced by enforcers, so long as those enforcers want to do their jobs. One such enforcer is Alvaro Bedoya, an FTC commissioner with a keen interest in antitrust's relationship to labor protection.
Bedoya understands that antitrust has a checkered history when it comes to labor. As he's written, the history of antitrust is a series of incidents in which Congress revised the law to make it clear that forming a union was not the same thing as forming a cartel, only to be ignored by boss-friendly judges:
https://pluralistic.net/2023/04/14/aiming-at-dollars/#not-men
Bedoya is no mere historian. He's an FTC Commissioner, one of the most powerful regulators in the world, and he's profoundly interested in using that power to help workers, especially gig workers, whose misery starts with systemic, wide-scale misclassification as contractors:
https://pluralistic.net/2024/02/02/upward-redistribution/
In a new speech to NYU's Wagner School of Public Service, Bedoya argues that the FTC's existing authority allows it to crack down on algorithmic management – that is, algorithmic management is illegal, even if you break the law with an app:
https://www.ftc.gov/system/files/ftc_gov/pdf/bedoya-remarks-unfairness-in-workplace-surveillance-and-automated-management.pdf
Bedoya starts with a delightful analogy to The Hawtch-Hawtch, a mythical town from a Dr Seuss poem. The Hawtch-Hawtch economy is based on beekeeping, and the Hawtchers develop an overwhelming obsession with their bee's laziness, and determine to wring more work (and more honey) out of him. So they appoint a "bee-watcher." But the bee doesn't produce any more honey, which leads the Hawtchers to suspect their bee-watcher might be sleeping on the job, so they hire a bee-watcher-watcher. When that doesn't work, they hire a bee-watcher-watcher-watcher, and so on and on.
For gig workers, it's bee-watchers all the way down. Call center workers are subjected to "AI" video monitoring, and "AI" voice monitoring that purports to measure their empathy. Another AI times their calls. Two more AIs analyze the "sentiment" of the calls and the success of workers in meeting arbitrary metrics. On average, a call-center worker is subjected to five forms of bossware, which stand at their shoulders, marking them down and brooking no debate.
For example, when an experienced call center operator fielded a call from a customer with a flooded house who wanted to know why no one from her boss's repair plan system had come out to address the flooding, the operator was punished by the AI for failing to try to sell the customer a repair plan. There was no way for the operator to protest that the customer had a repair plan already, and had called to complain about it.
Workers report being sickened by this kind of surveillance, literally – stressed to the point of nausea and insomnia. Ironically, one of the most pervasive sources of automation-driven sickness is the "AI wellness" apps that bosses are sold by AI hucksters:
https://pluralistic.net/2024/03/15/wellness-taylorism/#sick-of-spying
The FTC has broad authority to block "unfair trade practices," and Bedoya builds the case that this is an unfair trade practice. Proving an unfair trade practice is a three-part test: a practice is unfair if it causes "substantial injury," can't be "reasonably avoided," and isn't outweighed by a "countervailing benefit." In his speech, Bedoya makes the case that algorithmic management satisfies all three steps and is thus illegal.
On the question of "substantial injury," Bedoya describes the workday of warehouse workers working for ecommerce sites. He describes one worker who is monitored by an AI that requires him to pick and drop an object off a moving belt every 10 seconds, for ten hours per day. The worker's performance is tracked by a leaderboard, supervisors punish and scold workers who don't make quota, and the algorithm auto-fires anyone who fails to meet it.
Under those conditions, it was only a matter of time until the worker experienced injuries to two of his discs and was permanently disabled, with the company being found 100% responsible for this injury. OSHA found a "direct connection" between the algorithm and the injury. No wonder warehouses sport vending machines that sell painkillers rather than sodas. It's clear that algorithmic management leads to "substantial injury."
What about "reasonably avoidable?" Can workers avoid the harms of algorithmic management? Bedoya describes the experience of NYC rideshare drivers who attended a round-table with him. The drivers describe logging tens of thousands of successful rides for the apps they work for, on the promise of "being their own boss." But then the apps start randomly suspending them, telling them they aren't eligible to book a ride for hours at a time, sending them across town to serve an underserved area and still suspending them. Drivers who stop for coffee or a pee are locked out of the apps for hours as punishment, and so drive 12-hour shifts without a single break, in hopes of pleasing the inscrutable, high-handed app.
All this, as drivers' pay is falling and their credit card debts are mounting. No one will explain to drivers how their pay is determined, though the legal scholar Veena Dubal's work on "algorithmic wage discrimination" reveals that rideshare apps temporarily increase the pay of drivers who refuse rides, only to lower it again once they're back behind the wheel:
https://pluralistic.net/2023/04/12/algorithmic-wage-discrimination/#fishers-of-men
This is like the pit boss who gives a losing gambler some freebies to lure them back to the table, over and over, until they're broke. No wonder they call this a "casino mechanic." There are only two major rideshare apps, and they both use the same high-handed tactics. For Bedoya, this satisfies the second test for an "unfair practice" – it can't be reasonably avoided. If you drive rideshare, you're trapped by the harmful conduct.
The final prong of the "unfair practice" test is whether the conduct has a "countervailing benefit" that makes up for this harm.
To address this, Bedoya goes back to the call center, where operators' performance is assessed by "Speech Emotion Recognition" algorithms, a pseudoscientific hoax that purports to be able to determine your emotions from your voice. These SERs don't work – for example, they might interpret a customer's laughter as anger. But they fail differently for different kinds of workers: workers with accents – from the American South, or the Philippines – attract more disapprobation from the AI. Half of all call center workers are monitored by SERs, and a quarter of workers have SERs scoring them "constantly."
Bossware AIs also produce transcripts of these workers' calls, but workers with accents find them "riddled with errors." These are consequential errors, since their bosses assess their performance based on the transcripts, and yet another AI produces automated work scores based on them.
In other words, algorithmic management is a procession of bee-watchers, bee-watcher-watchers, and bee-watcher-watcher-watchers, stretching to infinity. It's junk science. It's not producing better call center workers. It's producing arbitrary punishments, often against the best workers in the call center.
There is no "countervailing benefit" to offset the unavoidable substantial injury of life under algorithmic management. In other words, algorithmic management fails all three prongs of the "unfair practice" test, and it's illegal.
What should we do about it? Bedoya builds the case for the FTC acting on workers' behalf under its "unfair practice" authority, but he also points out that the lack of worker privacy is at the root of this hellscape of algorithmic management.
He's right. The last major update Congress made to US privacy law was in 1988, when they banned video-store clerks from telling the newspapers which VHS cassettes you rented. The US is long overdue for a new privacy regime, and workers under algorithmic management are part of a broad coalition that's closer than ever to making that happen:
https://pluralistic.net/2023/12/06/privacy-first/#but-not-just-privacy
Workers should have the right to know which of their data is being collected, who it's being shared with, and how it's being used. We all should have that right. That's what the actors' strike was partly motivated by: actors who were being ordered to wear mocap suits to produce data that could be used to produce a digital double of them, "training their replacement," but the replacement was a deepfake.
With a Trump administration on the horizon, the future of the FTC is in doubt. But the coalition for a new privacy law includes many of Trumpland's most powerful blocs – like Jan 6 rioters whose location was swept up by Google and handed over to the FBI. A strong privacy law would protect their Fourth Amendment rights – but also the rights of BLM protesters who experienced this far more often, and with far worse consequences, than the insurrectionists.
The "we do it with an app, so it's not illegal" ruse is wearing thinner by the day. When you have a boss for an app, your real boss gets an accountability sink, a convenient scapegoat that can be blamed for your misery.
The fact that this makes you worse at your job, that it loses your boss money, is no guarantee that you will be spared. Rich people make great marks, and they can remain irrational longer than you can remain solvent. Markets won't solve this one – but worker power can.
Image: Cryteria (modified) https://commons.wikimedia.org/wiki/File:HAL9000.svg
CC BY 3.0 https://creativecommons.org/licenses/by/3.0/deed.en
#pluralistic#alvaro bedoya#ftc#workers#algorithmic management#veena dubal#bossware#taylorism#neotaylorism#snake oil#dr seuss#ai#sentiment analysis#digital phrenology#speech emotion recognition#shitty technology adoption curve