this post was submitted on 27 Oct 2023
282 points (98.3% liked)

Asklemmy

I would love the child of a Surface Book and a Framework laptop, or a bare keyboard attached to a screen that I could plug my phone into (possibly running Phosh) and use as the hardware for a laptop experience

[–] [email protected] 14 points 1 year ago* (last edited 1 year ago) (1 child)
  • Open source motherboards
  • Open source modems for computers and phones
  • Open source cars
  • GrapheneOS phone with enough RAM to run a decent offline LLM
  • Offline AI privacy/network manager designed to white-noise the stalkerware standards of the shitternet, with a one-click setup
  • Real AI hardware designed for tensor math, using standard DIMM system memory with many slots and buses in parallel instead of bleeding-edge monolithic GPU stuff targeting a broad market. The bottleneck in the CPU structure is the L2-to-L1 cache bus width and transfer rate, because massive tensor tables all need to run at one time. System memory is great for its size, but that size is only possible because the memory controller swaps in a relatively small chunk that is actually visible to the CPU. This is opposed to a GPU, where there is no memory controller and the memory size is directly tied to the compute hardware. This is the key difference we need to replicate. What we need is a bunch of small system memory sticks where only the chunk normally visible to the CPU is used, with each stick on its own bus running to the compute hardware. Then older, super cheap system memory could be paired with ultra cheap trailing-edge compute hardware to make cheaper AI that could run larger models (at the cost of more power consumption). Currently, GPUs with more than 24 GB of VRAM are pretty much unobtainium; an A6000 with 48 GB of VRAM will set you back at least $4k. I want to run a 70B model or larger. That would need ~140 GB of VRAM to run super fast on dedicated, optimised hardware. There is already an open source offline 180B model, and that would need ~360 GB of VRAM for near-instantaneous response. While super speeds with these large models are not needed for basic LLM prompting, they make a big difference with agents, where the model needs to do a bunch of stuff seamlessly while still appearing to work in realtime conversationally.
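
The memory figures in the last bullet can be checked with a rough sketch. It assumes 16-bit weights (2 bytes per parameter) and counts the weights only, ignoring KV cache and activations. The second function uses a common rule of thumb, not a measurement: when generating one token at a time, a dense model reads every weight once per token, so memory bandwidth caps the token rate. The function names, the channel count, and the 25.6 GB/s per-channel figure (the peak for one DDR4-3200 channel) are illustrative assumptions, not anything stated in the comment.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Assumption: 2 bytes per parameter (16-bit weights) unless overridden.
    """
    return params_billions * bytes_per_param


def max_tokens_per_second(weights_gb: float, channels: int,
                          gb_per_s_per_channel: float) -> float:
    """Rule-of-thumb upper bound on single-stream generation speed.

    Assumption: each token requires streaming all weights once, and the
    independent memory channels add up to one aggregate bandwidth.
    """
    aggregate_bandwidth = channels * gb_per_s_per_channel
    return aggregate_bandwidth / weights_gb


if __name__ == "__main__":
    for size in (70, 180):
        print(f"{size}B model: ~{weight_memory_gb(size):.0f} GB of weights")

    # Hypothetical build: 8 independent DDR4-3200 channels at 25.6 GB/s each.
    rate = max_tokens_per_second(weight_memory_gb(70), channels=8,
                                 gb_per_s_per_channel=25.6)
    print(f"70B over 8 channels: at most ~{rate:.1f} tokens/s")
```

Under these assumptions the 70B and 180B cases come out to the ~140 GB and ~360 GB quoted above. The second function shows why the number of parallel buses matters: the bound scales linearly with the channel count.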
[–] [email protected] 3 points 1 year ago

Real AI hardware designed for tensor math

Coral TPU