scruiser

joined 2 years ago
[–] scruiser@awful.systems 7 points 1 day ago (2 children)

Gary Marcus has been a solid source of sneer material and debunking of LLM hype, but yeah, you're right. Gary Marcus has been taking victory laps over a bar set so, so low by promptfarmers and promptfondlers. Also, side note, his negativity towards LLM hype shouldn't be misinterpreted as general skepticism towards all AI... in particular, Gary Marcus is pretty optimistic about neurosymbolic hybrid approaches; it's just that his predictions and hypothesizing are pretty reasonable and grounded relative to the sheer insanity of LLM hypesters.

Also, new possible source of sneers in the near future: Gary Marcus has made a lesswrong account and started directly engaging with them: https://www.lesswrong.com/posts/Q2PdrjowtXkYQ5whW/the-best-simple-argument-for-pausing-ai

Predicting in advance: Gary Marcus will be dragged down by lesswrong, not lesswrong dragged up towards sanity. He'll start to use lesswrong lingo and terminology and to use P(some event) based on numbers pulled out of his ass. Maybe he'll even start to be "charitable" to meet their norms and avoid downvotes (I hope not, his snark and contempt are both enjoyable and deserved, but I'm not optimistic based on how the skeptics and critics within lesswrong itself learn to temper and moderate their criticism within the site). Lesswrong will moderately upvote his posts when he is sufficiently deferential to their norms and window of acceptable ideas, but won't actually learn much from him.

[–] scruiser@awful.systems 13 points 2 days ago (5 children)

Unlike with coding, there are no simple “tests” to try out whether an AI’s answer is correct or not.

So for most actual practical software development, writing tests is in fact an entire job in and of itself, and it's a tricky one because covering even a fraction of the use cases and complexity the software will actually face when deployed is really hard. So simply letting the LLMs brute force trial-and-error their code through a bunch of tests won't actually get you good working code.
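To make that concrete, here is a minimal sketch of code that passes every test in a small suite and is still wrong. The function, the test cases, and the helper name `passes_suite` are all made up for illustration; the only claim is that a green test run says nothing about inputs the suite never exercised.

```python
import calendar

def is_leap_year(year):
    # Hypothetical "trial-and-error" result: good enough to pass the suite below,
    # but it ignores the century rule (1900 was not a leap year).
    return year % 4 == 0

# A plausible-looking but incomplete test suite (illustrative values only).
TESTS = {2020: True, 2021: False, 2023: False, 2024: True}

def passes_suite(fn):
    """Return True if fn agrees with every expected value in TESTS."""
    return all(fn(year) == expected for year, expected in TESTS.items())

if __name__ == "__main__":
    print("suite passes:", passes_suite(is_leap_year))
    # Compare against the standard library on an input the suite never covered.
    print("agrees on 1900:", is_leap_year(1900) == calendar.isleap(1900))
```

Adding more tests helps, but someone still has to know which cases matter, which is the hard part.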

AlphaEvolve kind of did this, but it was testing very specific, well defined, well constrained algorithms that could have very specific evaluation functions written for them, and it was using an evolutionary algorithm to guide the trial and error process. They don't say exactly in their paper, but that probably meant generating code hundreds or thousands or even tens of thousands of times to generate relatively short sections of code.

I've noticed a trend where people assume other fields have problems LLMs can handle, but the actually competent experts in that field know why LLMs fail at key pieces.

[–] scruiser@awful.systems 9 points 2 days ago (1 children)

Exactly. I would almost give the AI 2027 authors credit for committing to a hard date... except they already have a subtly hidden asterisk in the original AI 2027 noting some of the authors have longer timelines. And I've noticed lots of hand-wringing and but achkshuallies in their lesswrong comments about the difference between mode and median and mean dates and other excuses.

Like see this comment chain https://www.lesswrong.com/posts/5c5krDqGC5eEPDqZS/analyzing-a-critique-of-the-ai-2027-timeline-forecasts?commentId=2r8va889CXJkCsrqY :

My timelines moved up to median 2028 before we published AI 2027 actually, based on a variety of factors including iteratively updating our models. But it was too late to rewrite the whole thing to happen a year later, so we just published it anyway. I tweeted about this a while ago iirc.

...You got your AI 2027 reposted like a dozen times to /r/singularity, maybe many dozens of times total across Reddit. The fucking vice president has allegedly read your fiction project. And you couldn't be bothered to publish your best timeline?

So yeah, come 2028/2029, they already have a ready-made set of excuses to backpedal and move back the doomsday prophecy.

[–] scruiser@awful.systems 11 points 2 days ago* (last edited 2 days ago) (5 children)

So two weeks ago I linked titotal's detailed breakdown of what is wrong with AI 2027's "model" (tldr; even accepting the line goes up premise of the whole thing, AI 2027's math was so bad that they made the line always asymptote to infinity in the near future regardless of inputs). Titotal went to pretty extreme lengths to meet the "charitability" norms of lesswrong, corresponding with one of the AI 2027 authors, carefully considering what they might have intended, responding to comments in detail and depth, and in general not simply mocking the entire exercise in intellectual masturbation and hype generation like it rightfully deserves.

But even with all that effort, someone still decided to make an entire (long, obviously) post with a section dedicated to tone-policing titotal: https://thezvi.substack.com/p/analyzing-a-critique-of-the-ai-2027?open=false#%C2%A7the-headline-message-is-not-ideal (here is the lw link: https://www.lesswrong.com/posts/5c5krDqGC5eEPDqZS/analyzing-a-critique-of-the-ai-2027-timeline-forecasts)

Oh, and looking back at the comments on titotal's post... his detailed elaboration of some pretty egregious errors in AI 2027 didn't really change anyone's mind, at most moving them back a year to 2028.

So, moral of the story, lesswrongers and rationalists are in fact not worth the effort to talk to and we are right to mock them. The numbers they claim to use are pulled out of their asses to fit vibes they already feel.

And my choice for most sneerable line out of all the comments:

https://forum.effectivealtruism.org/posts/KgejNns3ojrvCfFbi/a-deep-critique-of-ai-2027-s-bad-timeline-models?commentId=XbPCQkgPmKYGJ4WTb

And I therefore am left wondering what less shoddy toy models I should be basing my life decisions on.

[–] scruiser@awful.systems 12 points 1 week ago* (last edited 1 week ago) (3 children)

Following up because the talk page keeps providing good material...

Hand of Lixue keeps trying to throw around the Wikipedia rules like the other editors haven't seen people try to weaponize the rules to push their views many times before.

Particularly for the unflattering descriptions I included, I made sure they reflect the general view in multiple sources, which is why they might have multiple citations attached. Unfortunately, that has now led to complaints about overcitation from @Hand of Lixue. You can't win with some people...

Looking back on the original lesswrong ~~brigade organizing~~ discussion of how to improve the Wikipedia article, someone tried explaining the rules to Habryka then and they were dismissive.

I don’t think it counts as canvassing in the relevant sense, as I didn’t express any specific opinion on how the article should be edited.

Yes Habryka, because you clearly have such a good understanding of the Wikipedia rules and norms...

Also, heavily downvoted on the lesswrong discussion is someone suggesting Wikipedia is irrelevant because LLMs will soon be the standard for "access to ground truth". I guess even lesswrong knows that is bullshit.

[–] scruiser@awful.systems 12 points 1 week ago (7 children)

The Wikipedia talk page is some solid sneering material. It's like Habryka and Hand of Lixue can't imagine any legitimate reason why Wikipedia has the norms it does, and they can't imagine how a neutral Wikipedian could come to write that article about lesswrong.

Eigenbra accurately calling them out...

"I also didn't call for any particular edits". You literally pointed to two sentences that you wanted edited.

Your twitter post also goes against Wikipedia practices by casting WP:ASPERSIONS. I can't speak for any of the other editors, but I can say I have never read nor edited RationalWiki, so you might be a little paranoid in that regard.

As to your question:

Was it intentional to try to pick a fight with Wikipedians?

It seems to be ignorance on Habryka's part, but judging by the talk page, instead of acknowledging their ignorance of Wikipedia's reasonable policies, they seem to be doubling down.

[–] scruiser@awful.systems 7 points 1 week ago (1 children)

Also lol at the 2027 guys believing anything about how grok was created.

Judging by various comments the AI 2027 authors have made, sucking up to the techbro side of the alt-right was in fact a major goal of AI 2027, and, worryingly, they seem to have succeeded somewhat (allegedly JD Vance has read AI 2027), but lol at the notion they could ever talk any of the techbro billionaires into accepting any meaningful regulation. They still don't understand their doomerism is free marketing hype for the techbros, not anything any of them are actually treating as meaningfully real.

[–] scruiser@awful.systems 6 points 1 week ago

Yeah, AI 2027's model fails back of the envelope sketches as soon as you try working out any features of it, which really calls into question the competency of its authors and everyone that has signal boosted it. Like they could have easily generated the same crit-hype bullshit with "just" an exponential model, but for whatever reason they went with this model. (They had a target date they wanted to hit? They correctly realized adding in extraneous details would wow more of their audience? They are incapable of translating their intuitions into math? All three?)

[–] scruiser@awful.systems 11 points 1 week ago (1 children)

We did make fun of titotal for the effort they put into meeting rationalists on their own terms and charitably addressing their arguments and, you know, being an EA themselves (albeit one of the saner ones)...

[–] scruiser@awful.systems 13 points 1 week ago* (last edited 1 week ago) (8 children)

So us sneerclubbers correctly dismissed AI 2027 as bad scifi with a forecasting model basically amounting to "line goes up", but if you end up in any discussions with people that want more detail, titotal did a really detailed breakdown of why their model is bad, even given their assumptions and trying to model "line goes up": https://www.lesswrong.com/posts/PAYfmG2aRbdb74mEp/a-deep-critique-of-ai-2027-s-bad-timeline-models

tldr; the AI 2027 model, regardless of inputs and current state, has task time horizons basically going to infinity at some near future date because they set it up weird. Also the authors make a lot of other questionable choices and have a lot of other red flags in their modeling. And the picture they had in their fancy graphical interactive webpage for fits of the task time horizon is unrelated to the model they actually used and is missing some earlier points that make it look worse.
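A rough sketch of how a "line goes up" curve can hit infinity at a fixed date: assume each doubling of the task time horizon takes a fixed fraction less time than the previous one. The 4.5-month first doubling and the 0.9 shrink factor below are illustrative assumptions, not numbers taken from this comment. Under that assumption the doubling times form a geometric series, so infinitely many doublings fit into a finite span, and the starting horizon never enters the calculation.

```python
def time_to_n_doublings(first_doubling_months, shrink, n):
    """Total months for n doublings when each one takes `shrink` times as long as the last."""
    return sum(first_doubling_months * shrink ** k for k in range(n))

def blowup_time(first_doubling_months, shrink):
    """Limit of the geometric series: months until the horizon has doubled infinitely often."""
    if not 0 < shrink < 1:
        raise ValueError("shrink must be strictly between 0 and 1 for a finite limit")
    return first_doubling_months / (1.0 - shrink)

if __name__ == "__main__":
    # Illustrative (assumed) parameters.
    first, shrink = 4.5, 0.9
    for n in (10, 50, 200):
        print(n, "doublings take", round(time_to_n_doublings(first, shrink, n), 3), "months")
    print("all doublings done by month", round(blowup_time(first, shrink), 3))
```

Note that the current horizon length is not even a parameter here: whether you start at one minute or one hour, the blow-up date is the same, which is the "regardless of current state" part.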

[–] scruiser@awful.systems 8 points 2 weeks ago

If you wire the LLM directly into a proof-checker (like with AlphaGeometry) or evaluation function (like with AlphaEvolve) and the raw LLM outputs aren't allowed to do anything on their own, you can get reliability. So you can hope for better, it just requires a narrow domain and a much more thorough approach than slapping some extra firm instructions in an unholy blend of markup languages in the prompt.
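A toy sketch of that generate-and-verify pattern, with an unreliable random guesser standing in for the LLM and an exact arithmetic check standing in for the proof-checker. The names `propose`, `verify`, and `solve` are hypothetical and have nothing to do with how AlphaGeometry or AlphaEvolve are actually implemented; the point is only that nothing leaves the loop unless the checker accepts it.

```python
import random

def propose(rng, n):
    """Stand-in for an unreliable generator: guesses a factor pair of n, usually wrongly."""
    a = rng.randint(2, n - 1)
    return a, n // a

def verify(n, candidate):
    """Exact checker: accepts only a genuine non-trivial factorization of n."""
    a, b = candidate
    return 1 < a < n and a * b == n

def solve(n, max_attempts=10_000, seed=0):
    """Return a verified factor pair of n, or None if no guess passed the checker."""
    rng = random.Random(seed)
    for _ in range(max_attempts):
        candidate = propose(rng, n)
        if verify(n, candidate):
            return candidate
    return None  # refuse to answer rather than emit an unchecked guess

if __name__ == "__main__":
    print(solve(91))                    # some verified factor pair of 91
    print(solve(97, max_attempts=500))  # 97 is prime, so nothing ever verifies
```

The reliability comes entirely from `verify`, which only exists because the domain is narrow enough to have an exact checker.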

In this case, solving math problems is actually something Google search could previously do (before dumping AI into it) and Wolfram Alpha can do, so it really seems like Google should be able to offer a product that does math problems right. Of course, this solution would probably involve bypassing the LLM altogether through preprocessing and postprocessing.

Also, btw, LLMs can be (technically speaking) deterministic if the temperature is set all the way down; it's just that this doesn't actually improve their performance at math or anything else. And it would still be "random" in the sense that minor variations in the prompt or previous context can induce seemingly arbitrary changes in output.
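A minimal sketch of what the temperature knob does at the sampling step, using made-up logits for a three-token vocabulary (the numbers and the function name `sample_token` are assumptions for illustration, not any particular model's API). At temperature 0 the sampler just takes the highest-scoring token, so the output is repeatable, but a tiny shift in the scores still flips which token wins.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    logits = [1.00, 1.01, 0.50]    # made-up scores
    nudged = [1.02, 1.01, 0.50]    # same scores after a tiny change in context
    print([sample_token(logits, 0, rng) for _ in range(5)])    # same index every time
    print(sample_token(nudged, 0, rng))                         # a different index wins
    print([sample_token(logits, 1.0, rng) for _ in range(5)])  # varies with the rng
```

Deterministic, in other words, is not the same thing as stable or correct.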

[–] scruiser@awful.systems 9 points 2 weeks ago (3 children)

Have they fixed it, as in it genuinely uses Python completely reliably, or "fixed" it, like they tweaked the prompt and now it uses Python 95% of the time instead of 50/50? I'm betting on the latter.

 

I found a neat essay discussing the history of Doug Lenat, Eurisko, and Cyc here. The essay is pretty cool: Doug Lenat made one of the largest and most systematic efforts to make Good Old Fashioned Symbolic AI reach AGI through sheer volume and detail of expert system entries. It didn't work (obviously), but what's interesting (especially in contrast to LLMs) is that Doug made his business, Cycorp, actually profitable and had it actually produce useful products in the form of custom-built expert systems for various customers over the decades, with a steady level of employees and effort spent (as opposed to LLM companies sucking up massive amounts of VC money to generate crappy products that will probably go bust).

This sparked memories of lesswrong discussion of Eurisko... which leads to some choice sneerable classic lines.

In a sequence classic, Eliezer discusses Eurisko. Having read an essay explaining Eurisko more clearly, a lot of Eliezer's discussion seems a lot emptier now.

To the best of my inexhaustive knowledge, EURISKO may still be the most sophisticated self-improving AI ever built - in the 1980s, by Douglas Lenat before he started wasting his life on Cyc. EURISKO was applied in domains ranging from the Traveller war game (EURISKO became champion without having ever before fought a human) to VLSI circuit design.

This line is classic Eliezer Dunning-Kruger arrogance. The lessons from Cyc were used in useful expert systems, and the effort spent building the expert systems was used to continue to advance Cyc, so I would call Doug really successful actually, much more successful than many AGI efforts (including Eliezer's). And it didn't depend on endless VC funding or hype cycles.

EURISKO used "heuristics" to, for example, design potential space fleets. It also had heuristics for suggesting new heuristics, and metaheuristics could apply to any heuristic, including metaheuristics. E.g. EURISKO started with the heuristic "investigate extreme cases" but moved on to "investigate cases close to extremes". The heuristics were written in RLL, which stands for Representation Language Language. According to Lenat, it was figuring out how to represent the heuristics in such fashion that they could usefully modify themselves without always just breaking, that consumed most of the conceptual effort in creating EURISKO.

...

EURISKO lacked what I called "insight" - that is, the type of abstract knowledge that lets humans fly through the search space. And so its recursive access to its own heuristics proved to be for nought. Unless, y'know, you're counting becoming world champion at Traveller without ever previously playing a human, as some sort of accomplishment.

Eliezer simultaneously mocks Doug's big achievements but exaggerates this one. The detailed essay I linked at the beginning actually explains this properly. Traveller's rules inadvertently encouraged a narrow degenerate (in the mathematical sense) strategy. The second place person actually found the same broken strategy Doug (using Eurisko) did, Doug just did it slightly better because he had gamed it out more and included a few ship designs that countered the opponent doing the same broken strategy. It was a nice feat of a human leveraging a computer to mathematically explore a game, it wasn't an AI independently exploring a game.

Another lesswronger brings up Eurisko here. Eliezer is of course worried:

This is a road that does not lead to Friendly AI, only to AGI. I doubt this has anything to do with Lenat's motives - but I'm glad the source code isn't published and I don't think you'd be doing a service to the human species by trying to reimplement it.

And yes, Eliezer actually is worried a 1970s dead end in AI might lead to FOOM and AGI doom. To a comment here:

Are you really afraid that AI is so easy that it's a very short distance between "ooh, cool" and "oh, shit"?

Eliezer responds:

Depends how cool. I don't know the space of self-modifying programs very well. Anything cooler than anything that's been tried before, even marginally cooler, has a noticeable subjective probability of going to shit. I mean, if you kept on making it marginally cooler and cooler, it'd go to "oh, shit" one day after a sequence of "ooh, cools" and I don't know how long that sequence is.

Fearmongering back in 2008 even before he had given up and gone full doomer.

And this reminds me, Eliezer did not actually predict which paths lead to better AI. In 2008 he was pretty convinced neural networks were not a path to AGI.

Not to mention that neural networks have also been "failing" (i.e., not yet succeeding) to produce real AI for 30 years now. I don't think this particular raw fact licenses any conclusions in particular. But at least don't tell me it's still the new revolutionary idea in AI.

Apparently it took all the way until AlphaGo (sometime 2015 to 2017) for Eliezer to start to realize he was wrong. (He never made a major post about changing his mind; I had to reconstruct this process and estimate this date from other lesswrongers discussing it and from noticing small comments from him here and there.) Of course, even as late as 2017, MIRI was still neglecting neural networks to focus on abstract frameworks like "Highly Reliable Agent Design".

So yeah. Puts things into context, doesn't it.

Bonus: One of Doug's last papers, which lists out a lot of lessons LLMs could take from Cyc and expert systems. You might recognize the co-author, Gary Marcus, from one of the LLM critical blogs: https://garymarcus.substack.com/

 

So, lesswrong Yudkowskian orthodoxy is that any AGI without "alignment" will bootstrap to omnipotence, destroy all mankind, blah, blah, etc. However, there has been the large splinter heresy of accelerationists that want AGI as soon as possible and aren't worried about this at all (we still make fun of them because what they want would result in some cyberpunk dystopian shit in the process of trying to reach it). However, even the accelerationists don't want Chinese AGI, because insert standard sinophobic rhetoric about how they hate freedom and democracy or have world conquering ambitions or they simply lack the creativity, technical ability, or background knowledge (i.e. lesswrong screeds on alignment) to create an aligned AGI.

This is a long running trend in lesswrong writing I've recently noticed while hate-binging and catching up on the sneering I've missed (I had paid less attention to lesswrong over the past year up until Trump started making techno-fascist moves), so I've selected some illustrative posts and quotes for your sneering.

  • Good news, China actually has no chance at competing at AI (this was posted before deepseek was released). Well, they are technically right that China doesn't have the resources to compete in scaling LLMs to AGI because it isn't possible in the first place

China has neither the resources nor any interest in competing with the US in developing artificial general intelligence (AGI) primarily via scaling Large Language Models (LLMs).

  • The Situational Awareness Essays make sure to get their Yellow Peril fearmongering on! Because clearly China is the threat to freedom and the authoritarian power (pay no attention to the techbro techno-fascists)

In the race to AGI, the free world’s very survival will be at stake. Can we maintain our preeminence over the authoritarian powers?

  • More crap from the same author
  • There are some posts pushing back on having an AGI race with China, but not because they are correcting the sinophobia or the delusions LLMs are a path to AGI, but because it will potentially lead to an unaligned or improperly aligned AGI
  • And of course, AI 2027 features a race with China that either the US can win with an AGI slowdown (and an evil AGI puppeting China) or both lose to the AGI menace. Featuring "legions of CCP spies"

Given the “dangers” of the new model, OpenBrain “responsibly” elects not to release it publicly yet (in fact, they want to focus on internal AI R&D). Knowledge of Agent-2’s full capabilities is limited to an elite silo containing the immediate team, OpenBrain leadership and security, a few dozen US government officials, and the legions of CCP spies who have infiltrated OpenBrain for years.

  • Someone asks the question directly: Why Should I Assume CCP AGI is Worse Than USG AGI? Judging by upvoted comments, the lesswrong orthodoxy that all AGI leads to doom is the most common opinion, and a few comments even point out the hypocrisy of promoting fear of Chinese AGI while saying the US should race for AGI to achieve global dominance, but there are still plenty of Red Scare/Yellow Peril comments

Systemic opacity, state-driven censorship, and state control of the media means AGI development under direct or indirect CCP control would probably be less transparent than in the US, and the world may be less likely to learn about warning shots, wrongheaded decisions, reckless behaviour, etc. True, there was the Manhattan Project, but that was quite long ago; recent examples like the CCP's suppression of information related to the origins of COVID feel more salient and relevant.

 

I am still subscribed to slatestarcodex on reddit, and this piece of garbage popped up on my feed. I didn't actually read the whole thing, but basically the author correctly realizes Trump is ruining everything in the process of getting at "DEI" and "wokism", but instead of accepting the blame that rightfully falls on Scott Alexander and the author, deflects and blames the "left" elitists. (I put left in quote marks because the author apparently thinks establishment Democrats are actually leftist, I fucking wish).

An illustrative quote (of Scott's that the author agrees with)

We wanted to be able to hold a job without reciting DEI shibboleths or filling in multiple-choice exams about how white people cause earthquakes. Instead we got a thousand scientific studies cancelled because they used the string “trans-” in a sentence on transmembrane proteins.

I don't really follow their subsequent points, they fail to clarify what they mean... Insofar as "left elites" actually refers to centrist Democrats, I actually think the establishment Democrats do deserve a major piece of the blame, in that their status quo neoliberalism has been rejected by the public but the Democratic establishment refuses to consider genuinely leftist ideas, but that isn't the point this author is actually going for... the author is actually upset about Democrats "virtue signaling" and "canceling" and DEI, so they don't actually have a valid point, if anything the opposite of one.

In case my angry disjointed summary leaves you any doubt the author is a piece of shit:

it feels like Scott has been reading a lot of Richard Hanania, whom I agree with on a lot of points

For reference the ssc discussion: https://www.reddit.com/r/slatestarcodex/comments/1jyjc9z/the_edgelords_were_right_a_response_to_scott/

tldr; author trying to blameshift on Trump fucking everything up while keeping up the exact anti-progressive rhetoric that helped propel Trump to victory.

 

So despite the nitpicking they did of the Guardian article, it seems blatantly clear now that Manifest 2024 was infested by racists. The post doesn't even count Scott Alexander as "racist" (although they do at least note his HBD sympathies) and identifies a full count of 8 racists. They mention a talk discussing the Holocaust as a Eugenics event (and added an edit apologizing for their simplistic framing). The post author is painfully careful and apologetic to distinguish what they personally experienced, what was "inaccurate" about the Guardian article, how they are using terminology, etc. Despite the author's caution, the comments are full of the classic SSC strategy of trying to reframe the issue (complaining the post uses the word controversial in the title, complaining about the usage of the term racist, complaining about the threat to their freeze peach and open discourse of ideas by banning racists, etc.).
