this post was submitted on 29 Jun 2023


cross-posted from: https://lemmy.world/post/800062

Eric Hartford (a.k.a. faldore) has announced OpenOrca, an open-source dataset and series of instruct-tuned language models he plans to release alongside Microsoft's new open-source challenger, Orca.

You can support Eric and all of the hard work he has done for the open-source community by following his newsletter on his site here.

Eric, if you're reading this and would like to share a donation link, I would be more than happy to include it in this post and any future posts about your work. Shoot me a message anytime.

Eric Hartford's Announcement

Today I'm announcing OpenOrca.

https://erichartford.com/openorca

https://twitter.com/erhartford/status/1674214496301383680

The dataset is completed. ~1mil of GPT4 augmented flanv2 instructions and ~3.5mil of GPT3.5 augmented flanv2 instructions.

We are currently training on LLaMA-13b. We expect completion in about 2 weeks.

When training is complete, we will release the dataset and the model at the same time.

We are seeking GPU compute sponsors for various targets, please consult the blog post and reach out if interested.

Thank you to our sponsors!

https://chirper.ai

https://preemo.io

https://latitude.sh

A few more highlights from the full article, which you should read here when you have a chance.

We expect to release OpenOrca-LLaMA-13b in mid-July 2023. At that time we will publish our evaluation findings and the dataset.

We are currently seeking GPU compute sponsors for training OpenOrca on the following platforms:

Falcon 7b, 40b

LLaMA 7b, 13b, 33b, 65b

MPT-7b, 30b

Any other targets that get a sponsor. (RWKV, OpenLLaMA)

Dataset consists of:

  • ~1 million of FLANv2 augmented with GPT-4 completions

  • ~3.5 million of FLANv2 augmented with GPT-3.5 completions
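
For a rough sense of the mix, here is a minimal Python sketch that totals the two approximate counts above and works out each source's share. The dictionary keys and the `mix_summary` helper are my own illustrative names, not anything from Eric's article, and the counts are the rounded figures from the announcement, so treat the result as a ballpark only.

```python
# Approximate counts taken from the announcement; the released dataset may differ.
APPROX_COUNTS = {
    "gpt-4": 1_000_000,    # ~1 million FLANv2 instructions with GPT-4 completions
    "gpt-3.5": 3_500_000,  # ~3.5 million FLANv2 instructions with GPT-3.5 completions
}


def mix_summary(counts):
    """Return the total number of examples and each source's share of that total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = mix_summary(APPROX_COUNTS)
print(f"total: ~{total:,}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

On those rounded figures, that is about 4.5 million examples in total, with a bit under a quarter of them coming from GPT-4.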

If you found this post interesting, please consider subscribing to the /c/FOSAI community, where I do my best to keep you in the know with the most important updates in free open-source artificial intelligence.
