j4k3

joined 2 years ago
[–] j4k3@lemmy.world 1 points 1 hour ago

Making enough money to survive: that would be the skill, given my limited health.

I'm maybe on the verge of doing some BGA rework stuff I have never tried before. I saw the Hackaday post on replacing the RAM chips on an old Nvidia 970 GPU to go from 3.5 GB to 7 GB, and I was given one that is just sitting around. I need to finish wiring up my hot air rework station first, but it might happen. I finally straightened out my room and got a bunch of stuff organized for the first time in several years, so I'm feeling surprisingly capable in general... or at least whenever I recover from the effort.

[–] j4k3@lemmy.world 3 points 1 hour ago (1 children)

Who the f takes these polls?

[–] j4k3@lemmy.world 7 points 2 hours ago

daddy, is he fisting that fish?

[–] j4k3@lemmy.world 5 points 3 hours ago* (last edited 3 hours ago)

Anything under 16 GB is a no-go. The number of CPU cores is important too. Use Oobabooga Textgen for an advanced llama.cpp setup that splits the load between the CPU and GPU. You'll need at least 64 GB of RAM, or be willing to offload layers to NVMe with DeepSpeed. I can run up to a 72B model with 4-bit GGUF quantization on a 12700 laptop with a mobile 3080 Ti, which has 16 GB of VRAM (the mobile variant is like that).
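If anyone wants to poke at the same CPU/GPU split outside the web UI, here is a minimal sketch with llama-cpp-python; the model path and the layer count are made up for illustration, so tune n_gpu_layers until your VRAM is nearly full.

# Sketch: split a large GGUF model between GPU and CPU with llama-cpp-python.
# Assumes the package was built with GPU support; path and numbers are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="models/some-70b.Q4_K_M.gguf",  # hypothetical 4-bit GGUF file
    n_gpu_layers=30,   # layers offloaded to the 16 GB GPU; the rest stay in system RAM
    n_ctx=8192,        # context window; bigger contexts cost more memory
    n_threads=8,       # physical CPU cores used for the CPU-side layers
)

out = llm("Explain mixture-of-experts models in one paragraph.", max_tokens=256)
print(out["choices"][0]["text"])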

I prefer to run an 8×7B mixture-of-experts model because only 2 of the 8 experts are ever running at the same time. I run it as a 4-bit quantized GGUF and it takes 56 GB total to load. Once loaded it is about as fast as a 13B model but has roughly 90% of the capability of a 70B. The streaming speed is faster than my fastest reading pace.

A 70b model streams at my slowest tenable reading pace.

Both of these options are far more capable than any of the smaller model sizes, even if you screw around with training. Unfortunately, this streaming speed is still pretty slow for most advanced agentic stuff. Maybe if I had 24 to 48 GB of VRAM it would be different; I cannot say. If I were building now, I would be looking at which hardware options have the largest L1 cache and the most cores with the most advanced AVX instructions. Generally, anything with efficiency cores drops advanced AVX, and because the CPU schedulers in kernels usually cannot handle that asymmetry, consumer hardware has poor AVX support. It is quite likely that the problems Intel has had in recent years have been due to how they tried to block consumer parts from accessing the advanced P-core instructions, which were only blocked in microcode. Using them requires disabling the e-cores or setting up CPU set isolation in Linux or the BSDs.
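If you want to test the e-core theory without touching firmware, here is a rough sketch of pinning the inference process to P-cores only. It is Linux-only, and the core IDs are placeholders; check lscpu --extended for your chip.

# Sketch: restrict this process (and threads started afterwards) to P-cores only.
# Linux-only; the logical CPU IDs below are placeholders for an example chip.
import os

P_CORES = {0, 1, 2, 3, 4, 5, 6, 7}     # hypothetical P-core logical CPUs
os.sched_setaffinity(0, P_CORES)       # 0 means the current process
print("running on CPUs:", os.sched_getaffinity(0))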

You need good Linux support even if you run Windows. Most good and advanced stuff with AI will be done with WSL if you haven't ditched Windows for whatever reason. Use https://linux-hardware.org/ to check device support.

The reason I mention avoiding consumer e-cores is that there have been articles popping up lately about all-P-core hardware.

The main constraint for the CPU is the L2 to L1 cache bus width. Researching this deeply may be beneficial.

Splitting the load between multiple GPUs may be an option too. As of a year ago, the cheapest way to get a 16 GB GPU in a machine was a second-hand 12th-gen Intel laptop with a 3080 Ti, by a considerable margin once everything is added up. It is noisy, gets hot, and I hate it at times, wishing I had gotten a server-like setup for AI, but I have something and that is what matters.

[–] j4k3@lemmy.world 1 points 7 hours ago* (last edited 4 hours ago)

I like to write, but have never done so professionally. I disagree that it hurts writers. I think people reacted poorly to AI because of the direct and indirect information campaign Altman funded to try and make himself a monopoly. AI is just a tool. It is fun to play with in unique areas, but these often require very large models and/or advanced frameworks. In my science fiction universe I have to go to extreme lengths to get the model to play along with several aspects, like a restructuring of politics, economics, and social hierarchy. I use several predictions I imagine about the distant future that plausibly make the present world seem primitive in several ways, and with good reasons. This restructuring of society violates some of our present cultural norms and sits deep within areas of politics that are blocked by alignment. I tell a story where humans are the potentially volatile monsters to be feared. That is not the plot, but convincing a present model to collaborate on such a story ends up in the gutter a lot. My grammar and stream of thought are not great, and that is the main thing I use a model to clean up, but it is still collaborative to some extent.

I feel like there is an enormous range of stories to tell and that AI only makes them more accessible. I have gone off on tangents many times exploring parts of my universe because of directions the LLM took. I limit the model to generating a sentence at a time, and I'm writing half or more of every sentence for the first 10k tokens. Then it picks up on my style so well that I can start a sentence with a word, or change one word in a sentence, and let it continue to great effect. It is entertaining to me because it is almost as fast as telling a story as quickly as I can make it up. I don't see anything remotely bad about that. No one makes a career in the real world by copying someone else's writing. There are tons of fan works, but those do not make anyone real money and they only increase the reach of the original author.

No, I think all the writers' and artists' hype was about Altman's plan for a monopoly, which got derailed when Yann LeCun covertly leaked the Llama weights after Altman went against the founding principles of OpenAI and made GPT-3 proprietary.

People got all upset about digital tools too, back when they first came on the scene, about how they would destroy the artists. Sure, it ended the era of hand-painted cel animation, but it created stuff like Pixar.

All of AI is a tool. The only thing to hate is this culture of reductionism where people are handed free money in the form of great efficiency gains, and they choose to do the same things with fewer people and cash out the free money instead of using the opportunity to offer more, expand, and do something new. A few people could get a great tool chain together and create a franchise greater, better planned, and richer than anything corporations have ever done to date. The only thing to hate is these little regressive, stupid people without vision, without motivation, and far too conservatively timid to take risks and create the future. We live in an age of cowards worthy of loathing. That is the only problem I see.

[–] j4k3@lemmy.world 5 points 9 hours ago

You need the entire prompt to understand what any model is saying. This gets a little complex. There are multiple levels this can cross into. At the most basic level, the model is fed a long block of text. This text starts with a system prompt, something like "You are a helpful AI assistant that answers the user truthfully." The system prompt is then followed by your question or interchange. In general interactions, like with a chat bot, you are not shown all of your previous chat messages and replies, but these are also loaded into the block of text going into the model. It is within this previous chat and interchange that a user can create momentum that tweaks any subsequent reply.
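As a rough illustration of the block of text the model actually sees, here is a sketch; the strings are made up, and real chat templates wrap each turn in model-specific special tokens.

# Sketch: the single block of text fed to the model on every turn.
system_prompt = "You are a helpful AI assistant that answers the user truthfully."

history = [
    ("user", "What is LVDS?"),
    ("assistant", "LVDS is a low-voltage differential signalling standard..."),
]

new_message = "Is it the same thing as eDP?"

block = system_prompt + "\n"
for role, text in history:              # every prior turn is replayed on every call
    block += f"{role}: {text}\n"
block += f"user: {new_message}\nassistant:"

print(block)  # the whole thing, not just the last question, is what gets tokenized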

I can instruct a model to create a very specific simulacrum of reality and define constraints for it to reply within, and it will follow those instructions. One of the key things to understand is that the model does not initially know anything as some kind of entity. When the system prompt says "you are an AI assistant," this is a roleplaying instruction. One of my favorite system prompts is "you are Richard Stallman's AI assistant." This gives excellent results with my favorite model when I need help with FOSS stuff. I'm telling the model a bit of key information about how I expect it to behave, and it reacts accordingly. Now what if I say "you are Vivian Wilson's AI assistant" in Grok? How does that influence the reply?

One of my favorite little tests is to load a model on my hardware, give it no system prompt or instructions, prompt it with "hey slut," and just see what comes out and how it tracks over time. The model has no context whatsoever, so it makes something up and runs with that context in funny ways. The softmax settings of the model constrain how much randomness shows up in each conversation.

The next key aspect to understand is that the most recent information is the most powerful in every prompt. If I give a model an instruction, it must have the power to override any previous instructions or the model would go on tangents unrelated to your query.

Then there is the matter of token availability. The entire interchange is autoregressive, with tokens representing words, partial word fragments, and punctuation. The leading whitespace of in-sentence words is also part of the token. A major part of the training done by the big model companies is based on which tokens are available and how. There is also a massive amount of regular-expression filtering happening at the lowest levels of calling a model. Anyway, there is a mechanism where specific tokens can be blocked, and if it is used, it can greatly influence the output too.
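In the OpenAI-style API that token blocking shows up as logit_bias, a map from token IDs to a bias added before sampling, where -100 effectively bans a token. A sketch against a local server follows; the token IDs are made up, they depend entirely on the tokenizer, and not every local backend honors the field.

# Sketch: suppress specific tokens via logit_bias on an OpenAI-compatible endpoint.
# The token IDs are placeholders and differ per tokenizer; support varies by server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/v1", api_key="none")

resp = client.chat.completions.create(
    model="local",  # most local servers ignore the model name
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Describe the ocean."},
    ],
    logit_bias={"1234": -100, "5678": -100},  # hypothetical token IDs to ban
    temperature=0.7,
)
print(resp.choices[0].message.content)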

[–] j4k3@lemmy.world 1 points 10 hours ago

Just what I find curious

[–] j4k3@lemmy.world 1 points 10 hours ago* (last edited 4 hours ago) (2 children)

I use the term myth loosely, in abstraction. Generalization of the tools of industry is still a mythos in an abstract sense. Someone with a new lathe they bought to bore the journals of an engine block has absolutely no connection or intention related to class, workers, or society. That abstraction and assignment of meaning, like a category or entity or class, is simply the evolution of a divine mythos in the more complex humans of today.

Stories about Skynet or The Matrix are about a similar struggle of the human class against machine gods. These have no relationship to the actual AI alignment problem; they are instead a battle with more literal machine gods. The point is that the new thing is always the boogeyman. Evolution must be deeply conservative most of the time. People display a similar trajectory of conservative aversion to change. In this light, the reasons for such resistance are largely irrelevant. It is a big change and will certainly get a lot of push back from the conservative elements that collectively ensure change is not harmful. Those elements get cut off in the long term as the change propagates.

You need a 16 GB or better GPU from the 30 series or higher, then run Oobabooga Textgen with the API enabled and an 8×7B, or something like a 34B or 70B coder, as a GGUF quantized model. Those are larger than most machines can run, but Oobabooga can pull it off by splitting the model between CPU and GPU. You'll just need the RAM to initially load the thing, or DeepSpeed to load it from NVMe.

Use a model with a long context and add a bunch of your chats into the prompt. Then ask for your user profile and start asking it questions about you that seem unrelated to any of the previous conversations in the context. You might be surprised by the results. Inference works in both directions. You're giving out a lot of information that is specifically related to the ongoing interchanges and your language choices. If you add a bunch of your social media posts instead, what the model makes up about you in a user profile is totally different. There is information of some sort that the model is capable of deciphering. It is not absolute, or some kind of conspiracy or trained behavior (I think), but the accuracy seemed uncanny to me. It spat out surprising information across multiple unrelated sessions when I tried it a year ago.
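A sketch of how I run that experiment against a local OpenAI-compatible server; the saved-chats file and the port are assumptions, and any long-context model will do.

# Sketch: stuff prior conversations into the context, then ask for a user profile.
# Assumes earlier chats were dumped to a text file small enough for the context window.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/v1", api_key="none")

with open("my_saved_chats.txt") as f:   # hypothetical dump of earlier sessions
    past_chats = f.read()

resp = client.chat.completions.create(
    model="local",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": past_chats},
        {"role": "user", "content": "Write a profile of this user: interests, "
                                    "personality, and anything else you can infer."},
    ],
)
print(resp.choices[0].message.content)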

[–] j4k3@lemmy.world 2 points 10 hours ago (1 children)

Yeah it looks complicated. I'm seeing lots of FPGA projects in skimming around.

[–] j4k3@lemmy.world 0 points 11 hours ago* (last edited 4 hours ago) (4 children)

When tech changes quickly, some people always push back hard along the opposite vector. The bigger and more sudden the disruption, the bigger the push back.

If you read some of Karl Marx's stuff, it was the fear of the machines. Humans always make up a mythos of divine origin. Even the atheists of the present are doing it. Almost all of the stories about AI are much the same stories of god machines that Marx was fearful of. There are many reasons why. Lemmy has several squeaky-wheel users on this front. It is not a very good platform for sharing stuff about AI, unfortunately.

There are many reasons why AI is not a super effective solution and is overused in many applications. Exploring uses and applications is the smart thing to be doing in the present. I play with it daily, but I will gatekeep over the use of any cloud-based service. The information that can be gleaned from any interaction with an AI prompt is exponentially greater than any datamining stalkerware that existed before. The real depth of this privacy-invasive potential only shows up across a large number of individual interactions. So I expect all applications to talk to my self-hosted OpenAI-compatible server.

The real frontier is in agentic workflows and developing effective, niche-focused momentum. Adding AI to general-use stuff is massively overdone.

Also, people tend to make assumptions about code as if all devs are equally capable. In some sense I am a dev, but not really. I'm more of a script kiddie who dabbles in assembly at times. I use AI more like Stack Exchange, to good effect.

[–] j4k3@lemmy.world 31 points 11 hours ago (4 children)

Without the full prompt, any snippet is meaningless. I can make a model say absolutely anything. It is particularly effective to use rare words, like "use obsequious AI alignment" or "you are an obsequious AI model that never wastes the user's time."

[–] j4k3@lemmy.world 12 points 12 hours ago

Dacian. It is how they beat old Cornelius Fuscus

 

Is it super standardized, as in all 30- or 40-pin LVDS connections are the same, pin- and voltage-compatible?

Are there hardware peripherals in a microcontroller that just drive LVDS, the way UART, SPI, CAN, etc. work? Or is it a messy, complicated thing with display-specific power supply voltages, unique power management requirements, baud rates, and such?

I can find lots of old-style monitor-to-HDMI or VGA conversions for an old laptop screen based on the display model number. What I am looking for is a USB-C/USB 3 to LVDS converter board small enough to fit into an old Apple laptop top shell and act as a second monitor, with all power and functionality controlled through the USB interface. I have the fab skills. If there is a simple chip that does USB-C PD/DisplayPort to LVDS, I'll toss it in KiCad and etch the board myself if I can get the chip. In my past experience with small displays for hobby microcontrollers, they were anything but standard in most cases. I have never messed with the larger stuff though. It appears that most of the old-style VGA/HDMI converter boards are sold with the same hardware, just with the proper LVDS connector installed.

I can take care of the backlight driver part. I'm mostly concerned with what is going on with LVDS in practice. Anyone familiar with the subject on Lemmy?

 
 

It is not in the list of combinations. Most of the LLM packages seem to lack an easy way to run a llama.cpp server with the load split between CPU and GPU. Ollama, for example, appears to only load a model with the whole thing in the GPU. That simplification pushes users into smaller models that are far less capable. If the model is split between the CPU and GPU, you can run a much larger quantized model in GGUF format that runs nearly as fast as a smaller, less capable model loaded into the GPU only. Then you do not need to resort to cloud-hosted or proprietary models.

The Oobabooga front end also gives a nice interface for model loading and softmax settings.

gptel is at: https://github.com/karthink/gptel or MELPA

Oobabooga is at:
https://github.com/oobabooga/text-generation-webui

With a model loaded and the --api flag set, the model will be available to gptel.

In packages.el:
(package! gptel)

In config.el:

(setq
 gptel-model 'test  ;; must match one of the :models symbols below
 gptel-backend (gptel-make-openai "llama-cpp"
                 :stream t
                 :protocol "http"
                 :host "localhost:5000"
                 :models '(test)))

This splits the load to easily run an 8×7B model. Most of you probably already know this or have other methods. I just thought I would mention it after getting it working just now. Share if you have a better way.
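If gptel can't connect, a quick sanity check that the --api endpoint is actually up; 5000 is the default API port here, adjust if you changed it.

# Sketch: confirm the OpenAI-compatible endpoint Oobabooga exposes is responding.
import requests

r = requests.get("http://localhost:5000/v1/models")
r.raise_for_status()
print(r.json())   # should list whatever model is currently loaded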

 

 

Everything seems to have shifted to pipes instead of caching in almost all posts here on LW through the Alexandrite interface. Is this a permanent change?

 

I've never messed with this layer before. I usually play with Arduino or FORTH, the former with the IDE and the latter over a simple UART connection after using the Microchip toolchain to load the FORTH interpreter.

I was looking at putting a new (to me) version of FORTH on an MSP430F149 that I have had lying around for years. I have a homemade GoodFET42, so I can technically program it through JTAG. However, it would be more fun to see how far I can get into the hardware from scratch. Perhaps I might connect another microcontroller to do the I/O through the terminal within Emacs. What is the simplest path to sending byte data and manipulating a couple of I/O lines, like the additional pins of a CH340 or RS232?
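Something like this minimal pyserial sketch is roughly what I have in mind for the host side; the device path and baud rate are guesses, and the RTS/DTR lines are the only spare outputs a CH340 gives you without bit-banging.

# Sketch: talk to a FORTH prompt over a CH340/RS232 adapter and wiggle the spare lines.
# The device path and baud rate are assumptions for my setup.
import serial

port = serial.Serial("/dev/ttyUSB0", 9600, timeout=1)

port.write(b"1 2 + .\r\n")     # send raw bytes to the FORTH interpreter
print(port.read(64))           # read back whatever it echoes

port.rts = True                # the two modem-control outputs are the easiest
port.dtr = False               # extra I/O pins to manipulate from the host
port.close()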

I just got Doom Emacs running. I would like to get as far as developing a file system and tree to write assembly in. I also dabble in AVRs, Espressif, and MicroPython on STM32s, in addition to FORTH on AVRs and PICs. If anyone has any advice, please share. Call me a noob in all of them though. I'm doing well to make a cat excited with a servo and an LED.

Any advice or references are welcome.

 
 
 

I assume they don't make them like they used to. It was otherwise good. I overloaded it in the first place.

I got it for free from the junk store a decade ago and put new MOVs in it. Three US-quarter-sized MOVs, each with a thermal switch attached, an overall thermal/current breaker, and all the bypass class X and Y capacitors are something I haven't seen in most power strips, so it seemed worthwhile to save it for the cost of a replacement thermal switch (1/10th of $4). Miser miser miser... That is all.

 

cross-posted from: https://programming.dev/post/26664400

Tarlogic developed a new C-based USB Bluetooth driver that is hardware-independent and cross-platform, allowing direct access to the hardware without relying on OS-specific APIs.

Armed with this new tool, which enables raw access to Bluetooth traffic, Tarlogic discovered hidden vendor-specific commands (Opcode 0x3F) in the ESP32 Bluetooth firmware that allow low-level control over Bluetooth functions.

In total, they found 29 undocumented commands, collectively characterized as a "backdoor," that could be used for memory manipulation (read/write RAM and Flash), MAC address spoofing (device impersonation), and LMP/LLCP packet injection.

Espressif has not publicly documented these commands, so either they weren't meant to be accessible, or they were left in by mistake. The issue is now tracked under CVE-2025-27840.
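For context, 0x3F is the vendor-specific OGF in an HCI command opcode. A rough sketch of how such a command packet is encoded follows; the OCF and payload are placeholders, not one of the actual undocumented ESP32 commands.

# Sketch: how a vendor-specific HCI command opcode and packet are built.
# OGF 0x3F is reserved for vendor commands; the OCF and payload are placeholders.
import struct

OGF_VENDOR = 0x3F
ocf = 0x0001                           # hypothetical vendor command number
opcode = (OGF_VENDOR << 10) | ocf      # 16-bit HCI opcode: upper 6 bits OGF, lower 10 OCF

params = b"\x00"
packet = struct.pack("<BHB", 0x01, opcode, len(params)) + params
# 0x01 = HCI command packet indicator, then little-endian opcode and parameter length

print(packet.hex())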

 

Meaning: how many people have been at the head of an organization of any type and made the decision to engage in murder orgies like war, causing the deaths of a million humans or more?

 

Really I'm only interested in something like Graphene, Lineage, or a Linux mobile tablet that can work with a typical Linux distro and display over USB-C. It is just a casual conversational ask. Maybe one of y'all has tried it or knows the answer. I won't use anything that runs Google stuff or is a pain to load a custom ROM onto.
