this post was submitted on 15 Sep 2024
Here is my view and a small timeline:
I am curious as to why they would offload any AI tasks to another chip? I just did a super quick search for upscaling models on GitHub (https://github.com/marcan/cl-waifu2x/tree/master/models) and they are tiny as far as AI models go.
It's the rendering bit that takes all the complex maths, and if that is reduced, that would leave plenty of room for running a baby AI. Granted, the method I linked to was only doing 29k pixels per second, but they said it wasn't GPU optimized. (FSR4 is going to be fully GPU optimized, I am sure of it.)
If the rendered image is only 85% of a 4k image, that's ~1.2 million pixels that need to be computed and it still seems plausible to keep everything on the GPU.
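To sanity-check those numbers, here is a quick back-of-the-envelope sketch. It assumes "4k" means 3840x2160 and that "85%" refers to the share of pixels rendered natively; the 60 fps target is my own assumption and not something from the linked repo.

```python
# Back-of-the-envelope check of the figures in the comment above.
WIDTH, HEIGHT = 3840, 2160      # assumption: "4k" means 4K UHD
RENDERED_FRACTION = 0.85        # share of pixels rendered natively (from the comment)
UNOPTIMIZED_RATE = 29_000       # pixels/second quoted for the non-GPU-optimized upscaler
TARGET_FPS = 60                 # assumption: illustrative frame-rate target

total_pixels = WIDTH * HEIGHT
upscaled_pixels = round(total_pixels * (1 - RENDERED_FRACTION))

# Time one frame would take at the quoted unoptimized rate.
seconds_per_frame = upscaled_pixels / UNOPTIMIZED_RATE

# Throughput needed to fill in those pixels every frame at the target rate.
required_rate = upscaled_pixels * TARGET_FPS
speedup_needed = required_rate / UNOPTIMIZED_RATE

print(f"pixels to fill in per frame: {upscaled_pixels:,}")
print(f"time per frame at 29k px/s:  {seconds_per_frame:.1f} s")
print(f"speedup needed for {TARGET_FPS} fps:   ~{speedup_needed:,.0f}x")
```

So the ~1.2 million figure holds up, and the gap between the unoptimized rate and real time is a few thousand times under these assumptions, which is the kind of gap GPU optimization or dedicated hardware would have to close.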
With all of that blurted out, is FSR4 AI going to be offloaded to something else? It seems like there would be significant technical challenges in creating another data bus that would also have to sync with memory and the GPU for offloading AI compute at speeds that didn't risk creating additional lag. (I am just hypothesizing, btw.)
The thing with “AI” or, better still, ML cores is that they’re very specialized. Apple hasn’t been slapping ML cores in all of their chips since the iPhone 8 because they are super powerful; it’s because they can do some things (that the hardware would have no problem doing anyway) while sipping power. You don’t have to think about AI in terms of the requirements of huge LLMs like ChatGPT that need data centers. Think about it like a hardware video decoder: with it, this thing can easily play 1080p video; going with raw CPU power rather than hardware decoding, maybe 480p. It’s why you can watch hours of video on your phone, but try doing anything that hits the CPU and the battery melts.
Edit: my example has been bothering me for days now. I want to clarify to avoid any possible misunderstanding that hardware video decoding has nothing to do with AI, it’s just another very specialized chip.
Well, Nvidia and Intel do that too, and I think Sony added an AI chip to the PS5 Pro for their new AI upscaler as well. We can already run AI calculations on our GPUs without AI acceleration, but that is not as fast. I have no numbers for you, only the logic that software optimized for dedicated AI chips should run faster and more efficiently, without slowing down the regular GPU work. Intel is in this hybrid state, where they support both. One version of XeSS can run on all GPUs, but it is worse than the XeSS version specialized for Intel GPUs with their dedicated AI accelerators.
Those upscalers you linked only upscale non-interactive video or single frames, right? An AI upscaler on live gameplay takes much more into consideration, like menus, which parts of the image are background, and such. This information is programmed into the game, so it's a drastically different approach from plain image upscaling, which wouldn't be any different from FSR 1 in that case. But I have no clue about numbers and how it compares to a solution like that.
I don't think this is a decision they made just recently; they were probably planning it long before they even started on FSR 4, plus they have already been working on it for 12 months or so (allegedly). I think AMD "needs" to do this AI offloading because the market demands it, the traditional solution didn't work out as hoped, and maybe it is happening in cooperation with Valve, Microsoft and other vendors. On the other hand, this AI accelerator could be used for things other than upscaling as well, as Nvidia demonstrated.
One technical reason why FSR 1 isn't very good but works in everything is that FSR 1 is the only one that just takes your current frame and upscales it. The newer ones are all temporal, like TAA, and use data from multiple previous frames.
Very simplified, they "jiggle" the camera each frame to a different position so that they can gather extra data to use, but that requires being implemented in the game engine directly.
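To make the "jiggle" a bit more concrete, here is a minimal sketch of how per-frame sub-pixel camera offsets could be generated. A Halton sequence is a common choice for this in TAA-style techniques; whether a given upscaler uses exactly this pattern is an assumption on my part, and the function names are made up for illustration.

```python
def halton(index: int, base: int) -> float:
    """Return element `index` (1-based) of the Halton sequence in `base`, in [0, 1)."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def jitter_offsets(count: int) -> list[tuple[float, float]]:
    """Hypothetical helper: sub-pixel (x, y) offsets in [-0.5, 0.5), one per frame."""
    return [(halton(i, 2) - 0.5, halton(i, 3) - 0.5) for i in range(1, count + 1)]


# Each frame, the engine would shift the projection by one of these offsets
# (in pixel units) so that successive frames sample different spots inside a pixel.
offsets = jitter_offsets(8)
for frame, (dx, dy) in enumerate(offsets):
    print(f"frame {frame}: dx={dx:+.3f} dy={dy:+.3f}")
```

Because the offset has to be applied to the camera projection itself, it can't be bolted on after the fact, which is why this has to be implemented inside the game engine.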
Kind of.
The big thing that actually defines FSR2 is that it has access to a bunch more data, particularly the depth buffer, motion vectors, and also, as you said, uses data from previous frames.
The camera jiggle is mostly just to avoid shimmering when the camera is stationary.
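As a rough illustration of what "motion vectors plus previous frames" buys you, here is a toy sketch of temporal accumulation. It is not FSR2's actual algorithm: the function names, the integer motion vectors and the fixed blend factor are all assumptions, and the depth buffer (which real implementations can use to detect pixels that were hidden last frame) is left out.

```python
def reproject(history, motion, x, y):
    """Follow the pixel's motion vector back to where it was in the previous frame."""
    dx, dy = motion[y][x]            # assumed: how far the pixel moved since last frame
    px, py = x - dx, y - dy
    if 0 <= py < len(history) and 0 <= px < len(history[0]):
        return history[py][px]
    return None                      # no usable history (e.g. it was off-screen)


def accumulate(current, history, motion, alpha=0.1):
    """Blend each current pixel with its reprojected history value."""
    out = []
    for y, row in enumerate(current):
        out_row = []
        for x, value in enumerate(row):
            previous = reproject(history, motion, x, y)
            if previous is None:
                out_row.append(value)    # fall back to the current frame only
            else:
                out_row.append(alpha * value + (1 - alpha) * previous)
        out.append(out_row)
    return out


# A bright pixel moved one step to the right between frames.
history = [[0.0, 10.0, 0.0]]
current = [[0.0, 0.0, 12.0]]
motion = [[(1, 0), (1, 0), (1, 0)]]
result = accumulate(current, history, motion)
print(result)
```

A single-frame upscaler like FSR 1 only ever sees `current`; the temporal ones get to reuse `history` as well, but only because the game hands over the motion data.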