• 0 Posts
  • 63 Comments
Joined 2Y ago
Cake day: Jun 16, 2023


Why discs instead of cartridges, which are currently the superior physical option? I personally try to buy physical whenever possible, because I don’t trust companies to not ban my account and flush hundreds of dollars of games down the toilet, and it generally feels better to have just that little extra bit more ownership over my own property.


I hope my reply didn’t come off as too caustic - I thought your reply with an open request for discussion was refreshing regardless of the common misconception. You’re not bad for being wrong, and I do enjoy sperging about these things. I didn’t intend to demean you, just in case it came across like that (if not just ignore this - I guess I’m overthinking it 🤔).


Let me flip it around again - humans regularly “hallucinate”, it’s just not something we recognize as such. There’s neuro-atypical hallucinations, yes, but there’s also misperceptions, misunderstandings, brain farts, and “glitches” which regularly occur in healthy cognition, and we have an entire rest of the brain to prevent those. LLMs are most comparable to “broca’s area”, which neurological case studies suggest naturally produces a stream of nonsense (see: split brain patients explaining the actions of their mute half). It’s the rest of our “cognitive architecture” which conditions that raw language model to remain self-consistent and form a coherent notion of self. Honestly this discussion on “conceptualization” is poorly conceived because it’s unfalsifiable and says nothing about the practical applications. Why do I care if the LLM can conceptualize if it does whatever subset of conceptualization I need to complete a natural language task?

AI is being super overhyped right now, which is unfortunate because it really is borderline miraculous, yet somehow they’ve overdone it. Emergent properties are empirical observations of behaviors they’re able to at least semi-consistently demonstrate - where it becomes “eye of the beholder” is when we dither on about psychology and philosophy about whether or not they’re some kind of “conscious” - I would argue they aren’t, and the architecture makes that impossible without external aid, but “conscious(ness)” is such a broad term that it barely has a definition at all. I guess to speedrun the overhype misinformation I see:

  • “They just predict one token at a time” is reductive and misleading even though it’s technically true - the loss function for language modeling inevitably requires learning abstract semantic operations. For instance, to complete “The capital of France is” a language model must in some way “know” about countries, cities, and the ontology of France.
  • “It’s just a chatbot” - ChatGPT is a chatbot, GPT-4 is a language model. Language models model how the likelihood of words and language changes over time. When I said “causal” before, this is an arbitrary restriction of the math such that the model only predicts the “next” word. If you remove this restriction, you can give it a sentence with a hole in it and it’ll tell you what words are most likely to be in that hole. You can think of it as being like a physics model, which describes how objects change over time. Putting these into a “generative” context allows you to extract latent semantic information generalized from the training corpus, including higher-order relationships. tl;dr “chatbot” is the first and least interesting application - anything which relates to “understanding” natural language is a potential application.
  • “Hallucinations show that they’re broken” - Hallucinations are actually what you’d expect from these sorts of models. If I had to broadly class the sorts of hallucinations I see, they would be:
    1. Model inaccuracy - Inevitable, but not the only reason. Essentially it failed to generalize in that specific way, like SD and hands.
    2. Unlikely sampling - It’s possible the code which picks the next word given the probability distribution accidentally picks one (or a series) with a very low chance. When this happens, the LLM has no way to “undo” that, which puts it in a very weird position where it has to keep predicting but it’s already in a space that shouldn’t really be possible. There are actually some papers which attempt to correct that, like adding an “undo token” (unfortunately can’t find the paper) or detecting OOD (out-of-distribution) conditions.
    3. Extrapolation - Especially for the earlier models with small context windows, if it needs information which is now outside that window it’s still modeling language, just without the necessary context. Without this context, it will instead pick one at random and talk about something unrelated. Compare this to eg dementia patients.
    4. Imagination - When you give it some kind of placeholder, like “<…>”, “etc etc etc” or “## code here ##”, most text in the training data like that will continue as if there was information in that place. Lacking context, just like with “extrapolation”, it picks one at random. You can mitigate this somewhat by telling it to only respond to things that are literally in the text, and GPT-4 doesn’t seem to have this problem much anymore, probably from the RLHF.
    5. Priming - If you prompt the LLM authoritatively enough, eg “find me a case that proves X” which implies such a case exists, if it doesn’t know of any such case, it will create one at random. Essentially, it’s saying “if there was a case that proved X it would look like this”. This is actually useful when properly constrained, eg if you want it to recursively generate code it might use an undefined function that it “wishes” existed.
  • “GPT-5 could be Roko’s basilisk!” - No. This architecture is fundamentally incapable of iterative thought processes; for it to develop those itself would require trillions more parameters, if it’s even possible. What’s more, LLMs aren’t utility-maximizers or reinforcement learning agents like we thought AGI would be; they do whatever you ask and have no will or desires of their own. There’s almost 0 chance this kind of model would go rogue, offset only slightly by people using RLHF, but that’s human-oriented so the worst you get is the model catering to humans being dumb.
  • “They tek er jerbs!” - Yes, but not because they’re “as good as humans” - they are better when given a specific task to narrowly focus on. The models are general, but they need to be told exactly what to do, which makes them excellent for capitalism’s style of alienated labor. I would argue this would actually be desirable if working wasn’t tied to people’s privilege to continue living - no living human should have to flip burgers when a robot can do it better, otherwise you’re treating the human like a robot.
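To put rough numbers on the “unlikely sampling” case from the hallucination list above, here’s a toy sketch. The vocabulary and scores are made up for illustration, and unlike a real LLM the distribution here stays fixed instead of being recomputed from the context at each step. The point is only that a small per-token chance of a bad pick adds up over a long generation, and that the loop only ever appends.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over tokens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(probs, rng):
    """Pick one token index at random, weighted by its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def chance_of_any_unlikely(p_unlikely, steps):
    """Probability that at least one of `steps` independent draws is unlikely."""
    return 1.0 - (1.0 - p_unlikely) ** steps

# Made-up scores for continuing "The capital of France is" (assumption)
vocab = ["Paris", "Lyon", "France", "banana"]
logits = [6.0, 2.0, 1.0, -1.0]
probs = softmax(logits)

# Total probability mass on tokens that individually have under a 5% chance
p_tail = sum(p for p in probs if p < 0.05)

rng = random.Random(0)
context = []
for _ in range(200):
    # Append-only: once a token is sampled, no step ever removes it
    context.append(vocab[sample_token(probs, rng)])

print(f"tail mass per draw: {p_tail:.3f}")
print("chance of at least one tail token in 200 draws: "
      f"{chance_of_any_unlikely(p_tail, 200):.3f}")
```

Lowering `temperature` in `softmax` shrinks the tail mass, which is roughly why low-temperature sampling tends to go off the rails less often, at the cost of less varied output.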

I’ll add more if I see or think of any. And if you have any specific questions, I’d be happy to answer. Also I should note, I’m of course using a lot of anthropomorphizing language here but it’s the closest we have to describing these concepts. They’re not human, and while they may have comparable behaviors in isolation, you can’t accurately generalize all human behaviors and their interactions onto the models. Even if they were AGI or artificial people, they would “think” in fundamentally different ways.

If you want a more approachable but knowledgeable discussion on LLMs and their capabilities, I would recommend a youtuber named Dave Shapiro. Very interesting ideas, he gets a bit far into hype and futurism but those are more or less contained within their own videos.


How about this, then. You’ve proposed that LLMs are not capable of conceptualizing, while I propose that the specifics of the internals don’t matter in this case because LLMs are made of dozens of layers which can easily explain higher orders of abstraction, and they exist as black boxes beyond the mechanics of the model. For the record, I personally know the mathematics and mechanics of how they work as I’ve written my own implementations (and I can answer any specific questions you might have). Is there an experiment you can propose which would falsify your assertion that LLMs cannot conceptualize? I’m taking for granted that they can as the null hypothesis because they can readily produce outputs that appear for all intents and purposes to conceptualize.


Did you try this with an LLM? Because GPT-4 analyzes it exactly the same way you did and then some:

List everything wrong in the sentence “The king of Italy is completely bald because his hair is currently naturally green.”

The sentence “The king of Italy is completely bald because his hair is currently naturally green” contains several issues:

  1. Factual Inaccuracy: As of my last update in September 2021, Italy is a republic and does not have a king.
  2. Logical Contradiction: The sentence states that the king is “completely bald” but then goes on to describe the color of his hair. If he is bald, he wouldn’t have hair to describe.
  3. Causality Issue: The sentence implies that the king is bald because his hair is green, which doesn’t make logical sense. The color of one’s hair doesn’t cause baldness.
  4. Ambiguity in “Naturally”: The word “naturally” could be interpreted in multiple ways. It could mean that the green color is natural to him (not dyed), or it could mean that the hair turned green on its own. The ambiguity could lead to confusion.
  5. Tense Mismatch: The sentence uses “is completely bald” (present tense) and “is currently naturally green” (also present tense) for the hair, which is contradictory.
  6. Redundancy: The word “currently” may be considered redundant if the sentence is understood to be describing a present condition.
  7. Clarity: The sentence overall is confusing and lacks clarity due to the contradictions and ambiguities mentioned.

In summary, the sentence has issues ranging from factual inaccuracies to logical contradictions and ambiguities.


Part of what is surprising about LLMs is they have emergent properties you wouldn’t expect from them being autocomplete on steroids. As it turns out, reducing the loss function for natural language readily generalizes to higher-order abstraction and conceptualization. There do need to be additional layers, for instance allowing an internal monologue, the ability to self-censor or self-correct, and mitigation for low-probability sampling (all of these being inherent limitations with the architecture), but apparently conceptualization is less special than we’d like to think.


LLMs are not created to chat, they’re literally what the name says - language models. They are very complex statistical models of the joint causal probability of all possible words given the previous words in the context window. There’s a common misconception that they’re “made for chat” by the wider public because ChatGPT was the first “killer application”, but they are much more general than that. What’s so profound about LLMs to AI and NLP engineers is that they’re general purpose. That is, given the right framework they can be used to complete any task expressible in natural language. It’s hard to convey to people just how powerful that is, and I haven’t seen software engineers really figure this out yet either. As an example I keep going back to, I made a library to create “semantic functions” in Python which look like this:

@semantic
def list_people(text) -> list[str]:
    '''List the people mentioned in the given text.'''

That is the entire function, expressed in the docstring. 10 months ago, this would’ve been literally impossible. I could approximate it with thousands of lines of code using SpaCy and other NLP libraries to do NER, maybe a dictionary of known names with fuzzy matching, some heuristics to rule out city names or more advanced sentence structure parsing for false positives, but the result would be guaranteed to be worse for significantly more effort. Here, I just tell the AI to do it and it… does. Just like that. But you can’t hype up an algorithm that does boring stuff like NLP, so people focus on the danger of AI (which is real, but laymen and news focus on the wrong things), how it’s going to take everyone’s jobs (it will, but that’s a problem with our system which equates having a job to being allowed to live), how it’s super-intelligent, etc. It’s all the business logic and doing things that are hard to program but easy to describe that will really show off its power.
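For anyone wondering how a decorator like that could work at all, here’s a minimal sketch. This is not the actual library: the `complete` parameter, the prompt wording, and the JSON reply convention are all assumptions for illustration, and `fake_complete` is a canned stand-in so the snippet runs without any model. Note it’s also applied as `@semantic(fake_complete)` rather than a bare `@semantic`, just to make the model call explicit.

```python
import functools
import inspect
import json
from typing import Callable

def semantic(complete: Callable[[str], str]):
    """Hypothetical decorator factory: the docstring becomes the task prompt."""
    def decorate(func):
        instruction = inspect.getdoc(func) or ""
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map the call's arguments onto the declared parameter names
            bound = signature.bind(*args, **kwargs)
            prompt = (
                f"Task: {instruction}\n"
                f"Arguments: {json.dumps(dict(bound.arguments))}\n"
                "Respond with JSON only."
            )
            # Assumes the model replies with valid JSON; real code needs retries
            return json.loads(complete(prompt))
        return wrapper
    return decorate

seen_prompts = []

def fake_complete(prompt: str) -> str:
    """Stand-in for a real LLM call; always returns the same canned reply."""
    seen_prompts.append(prompt)
    return json.dumps(["Alice", "Bob"])

@semantic(fake_complete)
def list_people(text) -> list[str]:
    '''List the people mentioned in the given text.'''

people = list_people("Alice met Bob for coffee.")
print(people)
```

Swapping `fake_complete` for a function that calls a real model is the only change needed to make it do actual work; everything hard lives in the model, which is the whole point.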


Any law with fines as a punishment is a law only the poor have to abide by, at the very least. A lot of laws which are explicitly meant to target wealthy entities like corporations or billionaires have their fines set comically low. Think car manufacturers regularly calculating that it’s cheaper to pay fines than it is to recall thousands of cars with deadly manufacturing faults.


To be Fair and Balanced ®™ to him, popular media really fed into his ego as the “real life Iron Man”, from Tony Stark literally addressing him in a Marvel movie, Star Trek listing him among the great geniuses (lmfao), etc. People with personality disorders like definitionally-all-billionaires are going to get caught up in their own farts hype.


My initial concern is that internet access is mostly considered a utility which at least the UN considers a human right. “If you can take away a right, it’s not a right, it’s a privilege.” That being said, “rights” are a social construct anyway which we regularly violate for “good” reasons, eg prison violates one’s right to freedom of movement, and as an institution they weirdly reinforce The State’s monopoly on violence and arbitration of who qualifies as “human” - in a twisted sort of way, prisoners are rendered “less human” by the state by their rights being taken away. Maybe it would be better to consider taking away internet access via ISPs sort of the moral equivalent of turning off someone’s water if they’re using it to poison the town well? If you abuse your right, you don’t get to use it anymore as a defensive mechanism for everyone else’s rights, ala sex offenders being put on a list which violates their privacy for better protection of more vulnerable groups.


How did you come to all of these different rules for managing eg the lights? Did you have to program them all manually?



Cults of personality tie your identity to the target, such that at some point they could literally shoot someone in the streets and it’ll still be excused. To do otherwise would break your ego which most people aren’t willing or prepared to do.


That sounds like a pain - surely there’s a shorter length that’s still strong enough that it can’t be cracked in a trillion years?
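Some back-of-the-envelope arithmetic on that, as a rough sketch. Every number is an assumption: a truly random password over the 94 printable ASCII characters, an attacker managing 10^12 guesses per second, and “cracked” meaning the whole keyspace gets exhausted.

```python
SECONDS_PER_YEAR = 31_557_600  # 365.25 days

def min_length(alphabet_size, guesses_per_second, years):
    """Smallest length whose full keyspace takes longer than `years` to exhaust."""
    budget = guesses_per_second * SECONDS_PER_YEAR * years
    length = 1
    while alphabet_size ** length <= budget:
        length += 1
    return length

# Assumed attacker: 10**12 guesses/second, running for 10**12 years
print(min_length(94, 10**12, 10**12))
```

This only holds for passwords picked uniformly at random; anything human-chosen falls to dictionary attacks long before brute force matters.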


I don’t think crypto is dead, I think fintech’s usage of crypto is dead. They came in and ruined what could’ve been a unique and revolutionary idea by making prospective currencies into speculative assets. We might see it reemerge in 10 years with capitalists and right-libertarians staying as far away as possible because they (hopefully) learned their lesson. The point of a currency is as a medium to store and exchange value, but the initial spike in fiat value, turning 12 bitcoins from $0.12 to $12000, attracted investors, get-rich schemers, and scam artists (but I repeat myself). It doesn’t help that it was designed to have negative inflation, so people were incentivized to hoard and bet on the market’s volatility, and there was no organization dedicated to keeping it stable like the Fed. Then alternatives to PoW like PoS came about which further incentivized hoarding and centralization (you lose stake if you spend, so don’t spend).

What people miss out on with all the hate about crypto (though the culture around it deserves a lot) is that the technology itself is potentially incredibly useful. Bitcoin was a first crack at the “Byzantine General’s Problem”, essentially how to coordinate a totally trustless and decentralized p2p network. Tying it to money was an easy way to get an incentive structure, but for applications like FileCoin it could just as easily allow for abstracted tit-for-tat services (in their case, “you host my file and I’ll host yours”). Stuff like NFTs have less obvious benefit, but the technology itself is a neutral tool that could see some legitimate use 20 years in the future like, say, a decentralized DNS system where you need a DHT mapping domains to IPNS hashes with some concept of ownership. Collectible monkeys are not and never were a legitimate use-case, at least not at that price point.


First I’d like to be a little pedantic and say LLMs are not chatbots. ChatGPT is a chatbot - LLMs are language models which can be used to build chatbots. They are models (like a physics model) of language, describing the causal joint probability distribution of language. ChatGPT only acts like an agent because OpenAI spent a lot of time retraining a foundation model (which has no such agent-like behavior) to model “language” as expressed by an individual. Then, they put it into a chatbot “cognitive architecture” which feeds it a truncated chat log. This is why the smaller models when improperly constrained may start typing as if they were you - they have no inherent distinction between the chatbot and yourself. LLMs are a lot more like broca’s area than a person or even chatbot.
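For the math-inclined, “causal joint probability distribution” can be written out as the standard chain-rule factorization. The notation is mine, with $w_t$ being the token at position $t$ in a sequence of length $n$:

```latex
P(w_1, \dots, w_n) = \prod_{t=1}^{n} P(w_t \mid w_1, \dots, w_{t-1})
```

Each factor is one “predict the next token” step, so a model that only ever predicts the next token is still, taken together, a model of whole sequences.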

When I say they’re “general purpose”, this is more or less an emergent feature of language, which encodes some abstract sense of problem solving and tool use. Take the library I wrote to create “semantic functions” from natural language tasks - one of the examples I keep going to in order to demonstrate the usefulness is

@semantic
def list_people(text) -> list[str]:
    '''List the people mentioned in the given text.'''

A year ago, this would’ve been literally impossible. I could approximate it with thousands of lines of code using SpaCy and other NLP libraries to do NER, maybe a massive dictionary of known names with fuzzy matching, some heuristics to rule out city names or more advanced sentence structure parsing for false positives, but the result would be guaranteed to be worse for significantly more effort. With LLMs, I just tell the AI to do it and it… does. Just like that. I can ask it to do anything and it will, within reason and with proper constraints.

GPT-3 was the first generation of this technology and it was already miraculous for someone like me who’s been following the AI field for 10+ years. If you try GPT-4, it’s at least 10x subjectively more intelligent than ChatGPT/GPT-3.5. It costs $20/mo, but it’s also been irreplaceable for me for a wide variety of tasks - Linux troubleshooting, bash commands, ducking coding, random questions too complex to google, “what was that thing called again”, sensitivity reader, interactively exploring options to achieve a task (eg note-taking, SMTP, self-hosting, SSI/clustered computing), teaching me the basics of a topic so I can do further research, etc. I essentially use it as an extra brain lobe that knows everything as long as I remind it about what it knows.

While LLMs are not people, or even “agents”, they are “inference engines” which can serve as building blocks to construct an “artificial person” or some gradation therein. In the near future, I’m going to experiment with creating a cognitive architecture to start approaching it - long term memory, associative memory, internal thoughts, dossier curation, tool use via endpoints, etc so that eventually I have what Alexa should’ve been, hosted locally. That possibility is probably what techbros are freaking out about, they’re just uninformed about the technology and think GPT-4 is already that, or that GPT-5 will be (it won’t). But please don’t buy into the anti-hype, it robs you of the opportunity to explore the technology and could blindside you when it becomes more pervasive.

What would AI have to do to qualify as “capable of some interesting new kind of NLP or can create something entirely new”? From where I stand, that’s exactly what generative AI is? And if it isn’t, I’m not sure what even could qualify unless you used necromancy to put a ghost in a machine…


It sounds simple but data conditioning like that is how you get scunthorpe being blacklisted, and the effects on the model even if perfectly executed are unpredictable. It could get into issues of “race blindness”, where the model has no idea these words are bad and as a result is incapable of accommodating humans when the topic comes up. Suppose in 5 years there’s a therapist AI (not ideal but mental health is horribly understaffed and most people can’t afford a PhD therapist) that gets a client who is upset because they were called a f**got at school, it would have none of the cultural context that would be required to help.

Techniques like “constitutional AI” and RLHF developed after the foundation models really are the best approach for these, as they allow you to get an unbiased view of a very biased culture, then shape the model’s attitudes towards that afterwards.


I like to say “they’re consistently biased”. They might have racial or misogynistic biases from the culture they ingested, but they’ll always express those biases in a consistent way. Meanwhile, humans can become more or less biased depending on whether they’ve eaten lunch yet or woke up tilted.


It makes me really sad because the techbros are a cargo cult with no understanding of the technology, and the anti-AI crowd is an overcorrection to the techbro hype train which overemphasizes the limitations without acknowledging that this is the first generation of general-purpose AI (distinct from AGI). Meanwhile I, someone who’s followed the AI field for 10 years waiting for this day, am overjoyed by the near miracle that is a general-purpose model that can handle any task you throw at it and simultaneously worried this yet-another-culture-war will distract people screeching about utopia vs skynet while capitalists use the technology to lay everyone off and send us into a neotechnofeudal society where labor has no power instead of the socialist utopia where work is optional we deserve…


I’m not sure it should be illegal, since it can be legitimately useful, but maybe something like “inconclusive evidence that isn’t enough to grant a warrant”. That way, you can get a list of potential suspects but you don’t end up violating rights by issuing undue warrants.


Facial recognition should always be a clue, never evidence. It should have the same weight as eyewitness testimony, because the algorithms will always have personal biases from their dataset. Otherwise, we risk lawyers saying stuff like “the algorithm gives a 99% confidence this is you” and the jury thinks this is some objective measure. Meanwhile, the algorithm only has 1% BIPOC in its dataset and labels with high confidence lots of them as being the same person.

Reminds me of the movie Anon, with this jaw-dropping quote at the end: “It’s not that I have something to hide. I have nothing I want to show you.”