• gamer@lemm.ee
    arrow-up
    12
    arrow-down
    1
    ·
    5 hours ago

    Damn, I had no idea the Game UI DB guy was so based. Huge respect from me.

    • ArchRecord@lemm.ee
      English
      arrow-up
      10
      ·
      4 hours ago

      It’s a person who runs a database of game UIs being contacted by people who want to train AI models on all of the data en masse.

  • Vanilla_PuddinFudge@infosec.pub
    arrow-up
    60
    arrow-down
    2
    ·
    10 hours ago

    Then there’s always that one guy who’s like “what about memes?”

    MS Paint memes are waaaaay funnier than AI memes, if only due to being a little bit ass.

  • Tartas1995@discuss.tchncs.de
    arrow-up
    61
    arrow-down
    5
    ·
    13 hours ago

    So many in these comments are like “what about the ethically sourced data ones?”

    Which ones? Name one.

    None of the big ones are. Wtf is ethically sourced? E.g. eBay wants to collect data for AI shit. My mom has an account, and she could opt out of them using her data, but when I told her about it, she told me that she didn’t understand. And she moved on. She just didn’t understand what the fuck they are doing and why she should care. But I guess it is “ethically” sourced, as they kinda asked by making it opt-out.

    That surely is very ethical and you cannot criticize them for it… As we all know, a 50yo adult fucking a 14yo would also be totally cool as long as the 14yo doesn’t say no. Right? That is how our moral compass works. /s

    Fucking disgusting. All of you tech bros complain about people not getting AI or tech in general and then talk about ethically sourced data. I spit on you.

    I love IT, I work in it and I live it, but I have morals and you could too.

      • cybersin@lemm.ee
        arrow-up
        6
        arrow-down
        1
        ·
        5 hours ago

        Yeah, except royalties in music are almost always a joke. Those artists are going to make much less off their AI voice than if they actually appeared in studio and the end product is going to be worse. If AI cost the same or more, there would be no market for it. Relevant story about Hollywood actors who sold AI likenesses.

        Even if it was actually “ethically trained”, the end result is still horrible.

        Also, paying to have an AI Snoop Dogg in your song is the lamest shit I’ve ever heard.

      • coolkicks@lemmy.world
        English
        arrow-up
        3
        arrow-down
        3
        ·
        4 hours ago

        That AI was trained on absolute mountains of data that wasn’t ethically gained, though.

        Just because an emerald ring is assembled by a local jeweler doesn’t mean the diamond didn’t come from slave labor in South Africa.

        • ArchRecord@lemm.ee
          English
          arrow-up
          4
          ·
          4 hours ago

          Voice Swap was not trained on any data that wasn’t “ethically gained.”

          Read the bottom of their FAQ that lists the exact databases in question.

          The couple of datasets they used on top of all the data they directly pay artists to consensually provide have permissive licenses that only require attribution for use, and gathered their information directly from a group of willing, consenting participants.

          They are quite literally the exception to the rule of companies claiming they’re ethical, then using non-ethically sourced data as a base for their models.

    • suy@programming.dev
      arrow-up
      11
      arrow-down
      6
      ·
      10 hours ago

      Which ones? Name one.

      What’s wrong with what Pleias or AllenAI are doing? Those are using only data that is in the public domain or suitably licensed, and are not burning tons of watts in the process. They release everything as open source. For real. Public everything. Not the shit that Meta is doing, or the weights-only DeepSeek.

      It’s incredible seeing this shit over and over, especially in a place like Lemmy, where the people are supposed to be thinking outside the box, and to be used to stuff which is less mainstream, like Linux, or, well, the fucking fediverse.

      Imagine people saying “yeah, fuck operating systems and software” because their only experience has been Microsoft Windows. Yes, those companies/NGOs are not making the rounds on the news much, but they exist, the same way that Linux existed 20 years ago, and it was our daily driver.

      Do I hate OpenAI? Heck, yeah, of course I do. And the other big companies that are doing horrible things with AI. But I don’t hate everything in AI, because I happen not to be an ignoramus who sees only 99% of it.

      • Tartas1995@discuss.tchncs.de
        arrow-up
        8
        arrow-down
        2
        ·
        edit-2
        5 hours ago

        AllenAI has datasets based on GitHub, Reddit, Wikipedia and “web pages”.

        I wouldn’t call any of them ethically sourced.

        “Web pages” because it is vague as fuck and makes me question whether they requested the consent of the creators.

        “Gutenberg project” is the funniest tho.

        Writing GitHub, Reddit and Wikipedia tells me very clearly that they didn’t. They might have asked the providers, but the provider is not the creator. Whether or not the provider has a license for the data is irrelevant on moral grounds unless it was an opt-in for the creator. Also, it has to be clearly communicated. Giving consent is not “not saying no”, it is a yes. Uninformed consent is not consent.

        When someone posted on Reddit in 2005 and forgot their password, they can’t delete their content from it. They didn’t post it with the knowledge that it would be used for AI training. They didn’t consent to it.

        Gutenberg project… Dead authors didn’t consent to their work being used to destroy a profession that they clearly loved.

        So I bothered to check out 1 dataset of the names that you dropped and it was unethical. I don’t understand why people don’t get it.

        What is wrong? That you think that they are ethical when the first dataset that I looked at already isn’t.

        • suy@programming.dev
          arrow-up
          2
          arrow-down
          1
          ·
          2 hours ago

          I don’t know where you got that image from. AllenAI has many models, and the ones I’m looking at are not using those datasets at all.

          Anyway, your comments are quite telling.

          First, you pasted an image without alternative text, which is harmful for accessibility (a topic in which these kinds of models can help, BTW, and it’s one of the obvious no-brainer uses in which they help society).

          Second, you think that you need consent for using works in the public domain. You are presenting the most dystopic view of copyright that I can think of.

          Even with copyright in full force, there is fair use. I don’t need your consent to feed your comment into a text-to-speech model, an automated translator, a spam classifier, or one of the many models that exist and that serve a legitimate purpose. The very image that you posted has very likely been fed into a classifier to rule out that it’s CSAM.

          And third, the fact that you think that a simple deep learning model can do so much is, ironically, something that you share with the AI bros that think the shit that OpenAI is cooking will do so much. It won’t. The legitimate uses of this stuff, so far, are relevant, but quite less impactful than what you claimed. The “all you need is scale” people are scammers, and deserve all the hate and regulation, but you can’t get past those and see that the good stuff exists, and doesn’t get the press it deserves.

          • Tartas1995@discuss.tchncs.de
            arrow-up
            1
            ·
            edit-2
            16 minutes ago

            https://allenai.org/dolma then you scroll down to “read dolma paper” and then click on it. This sends you to this site. https://www.semanticscholar.org/paper/Dolma%3A-an-Open-Corpus-of-Three-Trillion-Tokens-for-Soldaini-Kinney/ad1bb59e3e18a0dd8503c3961d6074f162baf710

            1. Funny how you speak about e.g. text-to-speech AI when I am talking about LLMs and image generation AIs. It is almost as if you didn’t want to criticize my point.
            2. It is funny how you use legal terms like copyright when I talk about morality. It is almost as if I am not saying that you shouldn’t be legally allowed to work with public domain material, but that you shouldn’t call it ethical when it is not. It is also funny how you say it is fair use. I invite you to turn the whole of Harry Potter from text to speech and publish it. It is fair use, isn’t it? You know that you wouldn’t be in the right there. But again, this isn’t a legal argument, it is a moral one.
            3. Who said that I think it could replace writers or painters in quality or skill? I said it could ruin the economic viability of the profession. That is a very, very different claim.

            I want to address your statement about my telling behavior. Sorry, you are right. I am sorry for the screen reader crowd. You all probably know that alt text can be misleading and that what someone says on the internet isn’t a reliable source. So I hope you can forgive me, as you did your own simple research into AllenAI anyway.

        • merari42@lemmy.world
          arrow-up
          6
          arrow-down
          1
          ·
          4 hours ago

          We generally had the reasonable rule that property ends at death. Intellectual property extending beyond the grave is corporatist 21st century bullshit. In the past all writing got quickly into the public domain like it should. Depending on the country, that happened somewhere between 25 years after the publishing date and the author’s death. Project Gutenberg reflects the law and the reasonable practice of allowing writing to go into the public domain.

          • Tartas1995@discuss.tchncs.de
            arrow-up
            4
            arrow-down
            1
            ·
            4 hours ago

            Good focus on 1 point, sadly bad point to focus on.

            What is lawful and legal, is not what is moral.

            The Holocaust was legal.

            Try again. Let’s start. Should the invention of AI have an influence on how we treat data? Is there a difference between reproducing a work after the author’s death and using possibly millennia of public domain data to destroy the economic viability of a profession? If there is, should public domain law consider that? Has the general public discussed these points and come to a consensus? Has that consensus been put into law?

            No? Sounds like the law is not up to date with the tech. So not only is legal not moral, legal isn’t up to date.

            You understand the point of public domain, right? You understand that even if you were right (you aren’t), that it would resolve the other issues, right?

      • SloganLessons@lemmy.world
        English
        arrow-up
        5
        arrow-down
        1
        ·
        9 hours ago

        It’s incredible seeing this shit over and over, especially in a place like Lemmy, where the people are supposed to be thinking outside the box, and to be used to stuff which is less mainstream, like Linux, or, well, the fucking fediverse.

        Lemmy is just an open-source Reddit, with all the pros and cons.

        • wellheh@lemmy.sdf.org
          arrow-up
          2
          ·
          3 hours ago

          It’s such a strange take, too. Like why do we have to include AI in our box if we fucking hate it?

    • Kusimulkku@lemm.ee
      arrow-up
      7
      ·
      11 hours ago

      Mozilla’s Common Voice seems pretty cool, but I’m not sure if that counts.

      It’s fun to record the clips.

      • ArchRecord@lemm.ee
        English
        arrow-up
        2
        ·
        4 hours ago

        I’ve contributed to labeling and scoring some of the Common Voice data before. Definitely a fun little thing to do when you have some free time.

        I was also pretty happy when I saw Open Assistant making a fully public, consensually contributed to database for text models, but they unfortunately shut down, and in the end there was only really enough data to fine-tune models rather than creating one from scratch.

    • Taleya@aussie.zone
      English
      arrow-up
      4
      arrow-down
      1
      ·
      12 hours ago

      What the fuck data could eBay collect to train AI? The fact that people buy Star Trek figurines??

      • Tartas1995@discuss.tchncs.de
        arrow-up
        4
        arrow-down
        1
        ·
        6 hours ago

        Thanks for making my point. People don’t understand and therefore can’t consent and therefore it isn’t ethically sourced data.

      • TheOakTree@lemm.ee
        arrow-up
        12
        ·
        10 hours ago

        You could train it to analyze sales tactics for different categories of items or even for specific items, then offer the AI’s conclusions as an ‘AI assistant’ locked behind a paywall.

        Plenty of use cases for collecting e-commerce data.

      • BoulevardBlvd@lemmy.blahaj.zone
        arrow-up
        1
        ·
        3 hours ago

        What’s an “untrained bot”? Did they code it from scratch themselves? I find it almost impossible to believe it wasn’t just a fork of an existing, unethical project but I’d love more detail

  • HappyTimeHarry@lemm.ee
    English
    arrow-up
    67
    arrow-down
    1
    ·
    15 hours ago

    It seems we’ve come full circle with “copying is not theft”… I have to admit I’m really not against the technology in general, but the people who are currently in control of it are all just the absolute worst people who are the least deserving of control over such a thing.

    Is it hypocritical to think there should be rules for corporations that don’t apply to real people? Like, why is it the other way around: I can go to jail or get a fine for sharing the wrong files, but some company does it and they just say it’s for the “common good” and they “couldn’t make money if they had to follow the laws”, and they get a fucking pass?

    • BigPotato@lemmy.world
      arrow-up
      19
      ·
      10 hours ago

      Yeah, I’ve been a pirate for so long I have zero moral grounds to be against using copyrighted stuff for free…

      Except I’m not burning a small nation’s worth of energy to download a NOFX album and I’m not recreating that album and selling it to people when they ask for a copy of Heavy Petting Zoo (I’m just giving them the real songs). So, moral high ground regained?

    • cloud_herder@lemmy.world
      arrow-up
      4
      ·
      9 hours ago

      Me either. Seems like it would be a really handy site to use if you were making your own game and wanted to see some examples or best practices.

      Had no idea it existed, I don’t make games though. But it’s always so cool to see something built that serves a unique niche I had never thought of before! Consulting used to be like that for me, but after a while it was always the same kinds of business problems, just in a different flavor of organization.

    • Kusimulkku@lemm.ee
      arrow-up
      11
      ·
      11 hours ago

      I mean, it does show up on the feed as normal, and sometimes people feel like it’s fine to give a differing perspective to such communities.

  • ZeroOne@lemmy.world
    arrow-up
    2
    arrow-down
    24
    ·
    edit-2
    4 hours ago

    That guy needs help, seriously. But then, this is Fuck-AI

    This kind of deranged behaviour will only increase AI adoption not decrease it.

    Of course people will see this as an attack on Anti-AI people & not me being concerned. But at this point I’m not surprised

  • Guns0rWeD13@lemmy.world
    arrow-up
    2
    arrow-down
    26
    ·
    9 hours ago

    oh, the luddites have their own instance now, huh? cat’s out of the bag, folks. deal with it.

  • Asswardbackaddict@lemmy.world
    arrow-up
    9
    arrow-down
    24
    ·
    edit-2
    9 hours ago

    As an artist, all y’all need to chill. The problem is capitalism, and it’s not like artists make a living anyway. Democratizing art opens up a lot of possibilities, you technophobes.

      • Asswardbackaddict@lemmy.world
        arrow-up
        2
        arrow-down
        4
        ·
        6 hours ago

        Easy. Don’t work a job or pay rent. Anarchism already exists. It just exists in the crannies (like right in front of you) where other domineering primates don’t beat you with sticks or boss you around. You don’t fix the system. You ignore it.

      • Asswardbackaddict@lemmy.world
        arrow-up
        2
        arrow-down
        1
        ·
        6 hours ago

        The early 20th century? I’d say physical philosophy would beg to differ, and do you see how you just killed your own argument by citing a time period? I think ideas don’t have value and that intellectual property stifles innovation. You had me in the first half, where I assumed you meant that people don’t just intuit new ideas from nowhere, then you cited a date and lost me.

        • Guns0rWeD13@lemmy.world
          arrow-up
          1
          arrow-down
          1
          ·
          5 hours ago

          do you see how you just killed your own argument by citing a time period?

          no. all art prior to that period was just refinement of forms that go back to pre-history. the 20th century introduced ‘modern art’, which basically solidified the idea that anything can be art.