mediocreatbest

joined 1 year ago
[email protected] 5 points 1 year ago

If you've never seen this before, I think it's transformative to how you read C/C++ declarations, and it cleared up a lot of confusion for me when I was learning.

https://cseweb.ucsd.edu/~gbournou/CSE131/rt_lt.rule.html
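For anyone who doesn't want to click through, here's a rough sketch of how the right-left rule reads a few declarations: start at the identifier, read right as far as you can, then left, and let parentheses redirect you. The names (values, add_one, twice, right_left_demo) are made up for illustration and aren't from the linked page.

```c
#include <assert.h>

/* Helper functions that exist only so the pointers have something to point at. */
static int add_one(int x) { return x + 1; }
static int twice(int x) { return 2 * x; }

/* Exercises each declaration and returns the sum of the values read back. */
int right_left_demo(void)
{
    int values[3] = {10, 20, 30};

    /* Start at p, go right: [3] -> "array of 3"; go left: * -> "pointers to";
       then int. So: p is an array of 3 pointers to int. */
    int *p[3] = {&values[0], &values[1], &values[2]};
    assert(*p[1] == 20);

    /* The parentheses force the * to bind first:
       q is a pointer to an array of 3 int. */
    int (*q)[3] = &values;
    assert((*q)[2] == 30);

    /* f is a pointer to a function taking int and returning int. */
    int (*f)(int) = add_one;
    assert(f(1) == 2);

    /* table is an array of 2 pointers to functions taking int and returning int. */
    int (*table[2])(int) = {add_one, twice};
    assert(table[1](4) == 8);

    return *p[1] + (*q)[2] + f(1) + table[1](4);
}
```

Calling right_left_demo() runs the asserts and returns 60 (20 + 30 + 2 + 8).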

 

echo 1 | sudo tee /sys/bus/pci/devices/<pci-id-of-device>/remove and then echo 1 | sudo tee /sys/bus/pci/rescan

 

I'm a little unsure whether I interpreted the results correctly. It seems like some things that TF Lite natively supports (apparently, their custom CNN model trained on MNIST) get really fast, while other things are a little hit-or-miss.

 

I have linked the pricing page because I think that's the most important aspect of a service like this.

The price isn't too expensive, but it isn't particularly cheap either.

Compared to OpenAI's ChatGPT models, for generating 1 million tokens (roughly the length of the King James Bible), you're looking at:

  • OpenAI's gpt-3.5-turbo ("ChatGPT-3.5") is $2 / 1m tokens
  • TextSynth's M2M100 1.2B (cheapest) is $3 / 1m tokens
  • OpenAI's gpt-4 ("ChatGPT-4") is $4 / 1m tokens
  • TextSynth's GPT-Neox 20B (most expensive) is $35 / 1m tokens
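To scale those per-million rates to some other token count, the arithmetic is just tokens divided by 1,000,000, times the rate. A minimal sketch, where cost_usd is a hypothetical helper and the rates you pass in are only the figures quoted in the list above, not current pricing:

```c
/* Hypothetical helper: dollar cost of `tokens` tokens at a given
   price per one million tokens. */
double cost_usd(double tokens, double usd_per_million)
{
    return tokens / 1000000.0 * usd_per_million;
}
```

For example, cost_usd(250000, 2.0) gives 0.5, so a quarter of a million tokens at the $2 rate quoted above would be about fifty cents.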
[email protected] 1 point 1 year ago

I don't know what kind of comments and posts you've made on Reddit, but if any of them are technical how-to's or something that may come up when people search for specific problems, then it might be good to leave those comments, or else just prefix each comment with your "purged" message instead of overwriting them entirely. I mean if it's fun memes or discussions, then you do you 😅 I'm just thinking of the tale of DenverCoder9. Plus, it probably costs more for Reddit to store a longer comment than a shorter one! Pennies or less, but still!

 

Abstract: "Prompting is now the primary way to utilize the multitask capabilities of language models (LMs), but prompts occupy valuable space in the input context window, and re-encoding the same prompt is computationally inefficient. Finetuning and distillation methods allow for specialization of LMs without prompting, but require retraining the model for each task. To avoid this trade-off entirely, we present gisting, which trains an LM to compress prompts into smaller sets of "gist" tokens which can be reused for compute efficiency. Gist models can be easily trained as part of instruction finetuning via a restricted attention mask that encourages prompt compression. On decoder (LLaMA-7B) and encoder-decoder (FLAN-T5-XXL) LMs, gisting enables up to 26x compression of prompts, resulting in up to 40% FLOPs reductions, 4.2% wall time speedups, storage savings, and minimal loss in output quality."
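The abstract doesn't spell out the mask, so treat this as a toy sketch of one way a "restricted attention mask" could look for a decoder-only model, under my own assumptions: the sequence is laid out as prompt, then gist tokens, then everything else, and tokens after the gist span are blocked from attending to the prompt, so whatever they need has to flow through the gist tokens. The function name build_gist_mask and its index parameters are hypothetical, not taken from the paper's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fills mask[i * n + j] with true when position i may attend to position j.
   Assumed layout: [0, gist_start) is the prompt, [gist_start, gist_end) holds
   the gist tokens, and [gist_end, n) is everything that follows. */
void build_gist_mask(bool *mask, size_t n, size_t gist_start, size_t gist_end)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            /* Ordinary causal rule: only look at the current or earlier positions. */
            bool causal = j <= i;
            /* Extra restriction: positions after the gist span cannot see the prompt. */
            bool hidden_prompt = (i >= gist_end) && (j < gist_start);
            mask[i * n + j] = causal && !hidden_prompt;
        }
    }
}
```

With n = 6, gist_start = 2, and gist_end = 3, the single gist token at index 2 can still attend to prompt positions 0 and 1, while positions 3 to 5 can attend to the gist token and to each other (causally) but not to the prompt.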
