this post was submitted on 05 Apr 2024
46 points (96.0% liked)

Learning Rust and Lemmy


Welcome

A collaborative space for people to work together on learning Rust, learning about the Lemmy code base, discussing whatever confusions or difficulties we're having in these endeavours, and solving problems, including, hopefully, some contributions back to the Lemmy code base.

Rules TL;DR: Be nice, constructive, and focus on learning and working together on understanding Rust and Lemmy.



Policies and Purposes

  1. This is a place to learn and work together.
  2. Questions and curiosity are welcome and encouraged.
  3. This isn't a technical support community. Those with technical knowledge and experience aren't obliged to help, though such help is very welcome. This is closer to a library of study groups than stackoverflow. Though, forming a repository of useful information would be a good side effect.
  4. This isn't an issue tracker for Lemmy (or Rust) or a place for suggestions. Instead, it's where the nature of an issue, what possible solutions might exist and how they could be or were implemented can be discussed, or, where the means by which a particular suggestion could be implemented is discussed.

Rules

  1. Lemmy.ml rule 2 applies strongly: "Be respectful, even when disagreeing. Everyone should feel welcome" (see Dessalines's post). This is a constructive space.
  2. Don't demean, intimidate or do anything that isn't constructive and encouraging to anyone trying to learn or understand. People should feel free to ask questions, be curious, and fill their gaps in knowledge and understanding.
  3. Posts and comments should be (more or less) within scope (on which see Policies and Purposes above).
  4. See the Lemmy Code of Conduct
  5. Where applicable, rules should be interpreted in light of the Policies and Purposes.


Thumbnail and banner generated by ChatGPT.

founded 9 months ago

Hey!

I'm a professional software engineer with several years of experience using Rust. Unfortunately I don't really have the time to contribute to Lemmy directly myself, but I love teaching other people Rust so if:

  • You are curious about Rust and why you should even learn it
  • You are trying to learn Rust but maybe having a hard time
  • You are wondering where to start
  • You ran into some specific issue

... or anything to do with Rust really, then feel free to ask in the comments or shoot me a PM 🙂

[–] [email protected] 5 points 7 months ago (1 children)

Are there resources to learn rust as first language? I’ve read some time ago that the best path was still “how to think like a computer scientist” and then switching to the rust books

[–] [email protected] 1 points 7 months ago (1 children)

Unfortunately there's not really anything to teach it as the first language. But I think it's still possible if you want

[–] [email protected] 2 points 7 months ago (1 children)

wouldn't it be possible to just ''translate'' and adapt ''how to think etc'' in rust? or the famous c book of whose name i don't remember right now lol

[–] [email protected] 3 points 7 months ago

If you know C or C++, you would have an easier time learning Rust, I would say. But I think it would be easier to learn Rust from the start instead of learning C or C++ first, then Rust. But it depends on the person probably. I would say try Rust by Example if you're looking for something more practical maybe (I haven't tried it, just heard about it).

[–] [email protected] 5 points 7 months ago (1 children)

You are wondering where to start

Since a lot of people are new, this might be helpful! Maybe what your learning journey was like, and if there are any resources you recommend?

Also thank you :)

[–] [email protected] 6 points 7 months ago

TL;DR:

  • Learned basic C++ as my first language (would not recommend) many years ago
  • Went to study computer science at university
  • Was intrigued by Rust from online blog posts, learned it in my free time as a student
  • Have been working professionally with Rust for more than 2 years now

Long version:

I "learned" C++ as my first language many years ago. I say "learned" because I just learned the basic syntax and some of the concepts about computers, like how the stack and the heap work and processes and threads and stuff like that. It was very early and I didn't really understand much, was just experimenting.

I then started studying computer science. We were exposed to many different programming languages at the university. Over the years, you got a feel for the strengths and weaknesses of each one. Especially during my master's I learned a lot as I was exposed to Haskell and in general nicer-working type systems.

Around the same time I started noticing blog posts about Rust online. I became quite intrigued with the language as it had a lot of promise and I was curious if it could really live up to the hype.

I then just started reading The Book and I was very quickly convinced that this language was actually living up to the hype. It's a very nice blend of object-oriented and functional programming, with an algebraic type system and monad-like error handling, which is just miles ahead of the usual exception-based error handling that many other languages use.

Then I basically started using it in all my side-projects in my free time, slowly building up a familiarity with the package ecosystem and the idiomatic way of writing Rust, reading lots of examples and documentation and such.

Nowadays, I would humbly call myself an expert in Rust or at least close to it. I think I've seen nearly all the language has to offer and ran into many of its weirder parts. I have a very good understanding of ownership and the borrow checker and I usually know exactly what is wrong and what to do about it when the compiler yells at me. The only thing I haven't really touched is embedded (no-std) programming in Rust.

[–] [email protected] 4 points 7 months ago (1 children)

I'm halfway through the book and rustlings and it's... Awkward so far.

Is it possible to become as productive with rust as one is with higher level languages?

Or should I stick to gc languages for domains where gcs make sense.

I was kinda hoping to make rust be my go to language for general purpose stuff.

[–] [email protected] 6 points 7 months ago* (last edited 7 months ago) (1 children)

Is it possible to become as productive with rust as one is with higher level languages?

Definitely. Rust can be very high level, as high as or at least very close to languages like Python. Just as high level as Go or Java I would say.

Rust is quite a wide language, in the sense that it is able to do both low-level and high-level programming. More generally, I'd say the distinction between "high level" and "low level" is just a matter of how much abstraction you use and you can build very powerful abstractions in Rust, and people have done this in many crates.

For instance while using the axum web server framework, you might see code similar to this:

async fn new(db: State<PgPool>, value: Json<MyType>) -> StatusCode {
    ...
}

This function defines a HTTP endpoint that takes a JSON body that deserializes to MyType and internally accesses a PostgreSQL database (for example to save the value in the database) and returns an empty body with a given status code. This is a very "dense" definition in the sense that a lot of semantics is pressed into so few lines of code. This is all possible due to the abstractions built by the axum web framework, and you see similar things in many other crates.

I was kinda hoping to make rust be my go to language for general purpose stuff.

It is a great language for all kinds of stuff - I use it as my general purpose language all the time :)

I would encourage you to stick to it and see how the language feels once you try it a bit more in a real project, and not just in exercises. The book can be a little dry because it has to explain both the high level and low level features of Rust.

If you have more specific questions about what exactly makes it awkward, maybe I can provide more guidance.

[–] [email protected] 2 points 7 months ago (1 children)

General Purpose Language ...

Interesting @[email protected] , I don't think I'd personally ever thought of having rust as a go-to general purpose language instead of something higher level like python etc. I'd always presumed it'd be the tool for higher performance tasks/libraries when desired (where the enforcement of safety and the "work" necessary to "pass the compiler" and the type system are, to me, desirable features).

But so far, I've seen enough to appreciate how rust, once you know it and the packages/crates and std library, can feel kinda high level ... so an interesting prospect to watch out for.

Exercises/Challenges to supplement "The Book" ...

My personal experience with the book hasn't been fantastic. It's too dry and "academic" IMO and could really do with a more hands on hacking stuff together complementary book. Which is odd because it opens with a good example of just diving in and writing a small program.

So yea, I'd echo SorteKanin's suggestion of writing programs as an exercise. I've tried to promote the idea in this community by posing challenges. I've got a "portal post" for them (only 1 so far) here: https://lemmy.ml/post/13418981. The first was writing a "file differ". I did it (you'll see the solution linked in the portal) ... it took longer than I'd preferred but definitely made me feel like I was learning way more than with the book.

I have a strong feeling that doing more of that sort of thing, including Advent of Code, maybe euler problems (are they still cool?) or whatever else people are interested in (I'd like to have a shot at hacking together a basic web app) ... would be really helpful here (thoughts?).


Tips on Rust as a General Purpose Language?

@[email protected] ... do you have any thoughts to share on how Rust is best approached as a higher level / general purpose language? We discussed elsewhere in this comments section the idea that "over-optimising" memory management is unnecessary ... that outside of hot loops you should just happily clone variables to make the borrow checker happy. Any other tricks or approaches you can think of?

Does unsafe mode become a tool worth using, or is it not worth the hassle? What about the Reference Counted pointer (smart pointer?) or Arc<Mutex>s (which I haven't come to learn/understand) for handling memory management?

Are there good and nice crates or standard library tools that aren't the most efficient or idiomatic but good for just getting stuff done?

Also, thanks for answering all of these questions!

[–] [email protected] 3 points 7 months ago (1 children)

do you have any thoughts to share on how Rust is best approached as a higher level / general purpose language? We discussed elsewhere in this comments section the idea that “over-optimising” memory management is unnecessary … that outside of hot loops you should just happily clone variables to make the borrow checker happy. Any other tricks or approaches you can think of?

I would say make good use of the crates available. As said above, Rust as a language allows for very powerful abstractions. The language itself is quite low level, if you didn't have any other code to use. But the standard library alone gives you a lot of tools. Add to that a host of practical crates that raise the abstraction level and you've got a very high-level experience.

I can't not share this meme btw. It's obviously a joke but every joke has a bit of truth:

For instance, you've got anyhow that allows for a bit of "quick and dirty" error handling via the ? operator. It's useful for cases where you don't care too much about what error happens, just that it happens. And when you get an error, you get a nice explanation of what happened (no 100 line exception stack trace with 1 needle in the haystack that maybe tells you what went wrong).
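To give a rough idea of what that "quick and dirty" style looks like, here is a minimal sketch. Note the swap: it uses Box<dyn Error> from the standard library as a stand-in for anyhow's error type so that it runs without any dependencies, and the function parse_and_double is made up for the illustration. With anyhow the shape is similar, just with its own Result type and nicer context on errors.

```rust
use std::error::Error;

// Box<dyn Error> is used here as a dependency-free stand-in for anyhow's
// error type: any error implementing std::error::Error converts into it,
// so `?` works on unrelated error types without custom error enums.
// `parse_and_double` is a hypothetical helper for this illustration.
fn parse_and_double(input: &str) -> Result<i64, Box<dyn Error>> {
    let trimmed = input.trim();
    if trimmed.is_empty() {
        // A plain string message also converts into Box<dyn Error>.
        return Err("input was empty".into());
    }
    // On failure, `?` converts the ParseIntError and returns early.
    let n: i64 = trimmed.parse()?;
    Ok(n * 2)
}

fn main() {
    for input in ["21", "abc", ""] {
        match parse_and_double(input) {
            Ok(value) => println!("{input:?} -> {value}"),
            Err(err) => println!("{input:?} failed: {err}"),
        }
    }
}
```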

There's also serde (by the same author even) that allows for very easy serialization and deserialization, for instance into JSON or other formats.

You could take a look at lib.rs to find more crates within a lot of different areas. It's a bit more categorized and better sorted than crates.io. Learning the ecosystem of crates and what's good for what is kind of a secondary thing to learn for Rust (or any language with a package ecosystem really) and it will take some trial and error probably.

Does unsafe mode become a tool worth using, or is it not worth the hassle?

No, unless you seriously need it and you definitely know what you're doing, you don't need it and you shouldn't need it. If you're reading stuff on this community, you don't need unsafe. I've worked with Rust for many years and I've still only used unsafe in the areas where it's really needed, like FFI (calling into C or C++ code). Perhaps with embedded programming you need it more, but most people don't do much embedded.

What about the Reference Counted pointer (smart pointer?) or Arcs (which I haven’t come to learn/understand) for handling memory management?

You should definitely learn about Arc (Atomic Reference Counted pointer) and Mutex (mutual exclusion lock) - I believe the book has a chapter or something about them. They provide one way to achieve "shared" ownership across different threads and you'll probably have lots of headaches if you don't know about them.
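As a small sketch of what that shared ownership looks like in practice (the function name count_in_parallel and the counter are just made up for the example), several threads can each hold a clone of the Arc and take turns locking the Mutex:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: several threads increment one shared counter.
fn count_in_parallel(threads: usize, increments: usize) -> usize {
    // Arc gives shared ownership; Mutex gives exclusive access while locked.
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..threads {
        // Cloning the Arc only bumps the reference count; the data is shared.
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..increments {
                let mut guard = counter.lock().unwrap();
                *guard += 1;
                // The lock is released when `guard` goes out of scope.
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", count_in_parallel(4, 1000));
}
```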

Are there good and nice crates or standard library tools that aren’t the most efficient or idiomatic but good for just getting stuff done?

There's some I mentioned above and for most major use cases, there are crates. Like axum for web servers as I mentioned above. Look at lib.rs I would say. It really depends on what specifically you want to do though.

[–] [email protected] 2 points 7 months ago

Thanks!

And great little meme there!

[–] [email protected] 4 points 7 months ago (1 children)

I started learning rust yesterday, no specific questions, but the language feels weird to me. (background in java, c++, python)

do you have sources i could read up on to get a thorough understanding. Books, news letters, websites etc?

[–] [email protected] 4 points 7 months ago

C++ knowledge should make it easy to understand Rust. In a lot of ways, the Rust compiler is just enforcing all the stuff you need to do in C++ anyways in order to avoid undefined behaviour.

I think The Book is fine, that's what I used to learn. You can also learn quite a bit by just reading the standard library documentation, which is very nicely structured.

[–] [email protected] 4 points 7 months ago* (last edited 7 months ago) (1 children)

So far, I’m not sure I’m satisfied with whatever I've read about the borrow checker.

Do you have any sort of synthesis of the core ideas or principles or best practices that work for you?

Personally I’m partial to core or essential principles that seem esoteric at first glance but become clearer with more practice and experience.

Otherwise, are there any pain points in getting better at rust that you think are worth just sort of not caring too much about and taking a short cut around until you get better? (Eg: just clone, don’t feel bad about it)


Otherwise, thanks for this! Feel free to subscribe and chime in whenever people have questions.

[–] [email protected] 5 points 7 months ago* (last edited 7 months ago) (2 children)

Do you have any sort of synthesis of the core ideas or principles or best practices that work for you?

I think it's hard to give some sort of master theorem because it really comes down to what you are doing.

That said, start by considering ownership. As you start, avoid creating types with non-static references inside of them i.e. keep to types that own their contents. Types with references require you to use lifetimes and you'll be better off getting used to that later. And frankly, it's often not necessary. The vast majority of the types you'll ever make will fully own their contents.
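To make the difference concrete, here is a minimal sketch with two made-up types (OwnedUser and BorrowedUser are not from any real code base). The owning version needs no lifetime annotations and can be built from temporary data; the borrowing version is tied to whatever it points into.

```rust
// Owns its contents: no lifetime parameter needed.
#[derive(Debug, Clone, PartialEq)]
struct OwnedUser {
    name: String,
}

// Borrows its contents: needs a lifetime parameter and cannot
// outlive the data it refers to.
#[derive(Debug, Clone, PartialEq)]
struct BorrowedUser<'a> {
    name: &'a str,
}

// Can return a value built from a local String, because the
// returned struct owns that String.
fn load_user(raw: &str) -> OwnedUser {
    let cleaned = raw.trim().to_lowercase();
    OwnedUser { name: cleaned }
}

// Can only hand back a view into `raw`; the result is valid
// only for as long as `raw` is.
fn view_user(raw: &str) -> BorrowedUser<'_> {
    BorrowedUser { name: raw.trim() }
}

fn main() {
    let input = String::from("  Ferris ");
    let owned = load_user(&input);
    let borrowed = view_user(&input);
    println!("{owned:?} {borrowed:?}");

    // `owned` is independent of `input`, so it is still usable after
    // `input` is gone. Using `borrowed` after this line would not compile.
    drop(input);
    println!("{owned:?}");
}
```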

Now when you make functions or methods, think of the three options you can use when taking parameters:

  1. Take ownership, i.e. MyType or self for methods. Use this if your function needs to own the data, often because it needs to take that data and put it into another value or otherwise consume it. Honestly this is the least common option! It's quite rare that you need to consume data and not let it be available to anyone else after the function call.
  2. Take a shared reference i.e. &MyType or &self for methods. Use this if your function only needs read access to the data. This is probably the most common case. For instance, say you need to parse some text into some structured type; you'd use &str because you just need to read the text, not modify it.
  3. Take a unique reference, i.e. &mut MyType or &mut self. You'll need this if you want to refer to some data that is owned elsewhere but that you need temporary exclusive (unique) access to, so that you can modify it. This is often used in methods to be able to modify private fields of self for example. You need to think about the fact that no one else can have a reference to the data at the same time though. Often this is not a problem, but sometimes you need to be able to mutate stuff from multiple different places. For that, you can consider either passing ownership around, for instance via channels and sending messages or you could reach for Arc<Mutex<T>> to allow mutation through shared references with a tiny bit of runtime performance cost.
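
The three options above can be sketched roughly like this. The Note type and the function names are invented for the example; the point is only the parameter types.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Note {
    text: String,
}

// 1. Takes ownership: the note is moved into the shelf and the
//    caller can no longer use it afterwards.
fn archive(note: Note, shelf: &mut Vec<Note>) {
    shelf.push(note);
}

// 2. Shared reference: read-only access, the caller keeps the note.
fn word_count(note: &Note) -> usize {
    note.text.split_whitespace().count()
}

// 3. Unique reference: temporary exclusive access in order to modify.
fn append(note: &mut Note, extra: &str) {
    note.text.push_str(extra);
}

fn main() {
    let mut note = Note { text: String::from("borrow checker") };
    let mut shelf = Vec::new();

    append(&mut note, " basics");
    println!("{} words", word_count(&note));

    archive(note, &mut shelf);
    // `note` has been moved; using it here would be a compile error.
    println!("{} archived", shelf.len());
}
```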

When you think in terms of ownership, the borrow checker becomes easy to understand. All it's doing is checking who owns what and who is borrowing what owned data from who and are they giving it back when they said they would?

I hope that helps but again, it's a very general question. If you give me a concrete case I could also give more concrete advice.

PS: Abso-fucking-lutely just clone and don't feel bad about it. Cloning is fine if you're not doing it in a hot loop or something. It's not a big deal. The only thing you need to consider is whether cloning is correct - i.e. is it okay for the original and the clone to diverge in the future and not be equal any more? Is it okay for there to be two of this value? If yes, then it's fine.
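
A tiny sketch of that "is it okay for them to diverge" question, using a made-up Settings type: the clone is changed and the original stays as it was, which is exactly what you want here.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    name: String,
    retries: u32,
}

// Hypothetical helper: clones the base settings and changes the copy.
// The original is left untouched, so the two values diverge on purpose.
fn with_retries(base: &Settings, retries: u32) -> Settings {
    let mut copy = base.clone();
    copy.retries = retries;
    copy
}

fn main() {
    let base = Settings { name: String::from("default"), retries: 3 };
    let patient = with_retries(&base, 10);

    println!("{base:?}");
    println!("{patient:?}");
}
```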

[–] [email protected] 3 points 7 months ago

This is actually a decent synthesis. I personally didn’t learn anything from it per se (not a criticism), but this sort of break down has been lacking from whatever I’ve consumed so far (mostly the Brown university version of the book) and I think it’s good and helpful.

So far I’ve found the book (or the Brown University version, because there are differences AFAICT) to be too front loaded on ownership.

[–] [email protected] 2 points 7 months ago (1 children)

PS: Abso-fucking-lutely just clone and don’t feel bad about it. Cloning is fine if you’re not doing it in a hot loop or something. It’s not a big deal. The only thing you need to consider is whether cloning is correct - i.e. is it okay for the original and the clone to diverge in the future and not be equal any more? Is it okay for there to be two of this value? If yes, then it’s fine.

Nice!

I haven’t used clippy (just rust analyser so far, and the compiler of course) … but I wonder if it’d be nice to have some static analysis that gives some hints about how costly a clone is likely to be, just so you could have some confidence about not cloning where it will actually hurt.

Also, thanks for the reply!

[–] [email protected] 3 points 7 months ago (1 children)

I would recommend changing rust-analyzer "check command" setting from "check" to "clippy", then you'll see clippy hints in your editor.

[–] [email protected] 2 points 7 months ago (1 children)

Cheers! So clippy is worth it then?

[–] [email protected] 2 points 7 months ago (1 children)

Oh definitely, it'll point out if you can do something simpler. The defaults are fine as well. Honestly don't know why it's not just the default to use it.

[–] [email protected] 2 points 7 months ago

Cheers! Will do!

[–] [email protected] 4 points 7 months ago* (last edited 7 months ago) (1 children)

I think it's quite difficult to write Rust. I can get something working but any refactoring is a pain, and I usually get issues with the borrow checker as soon as I move things around.

And it's complicated to get the lifetimes correct too. I feel like it's just a lot of effort, and not very fun to put so much time into figuring out Rust rather than what my program should do.

But if I stick with it, it will probably become second nature. It's just a very annoying language sometimes because of the mental gymnastics.

I switched to go and produced two full programs in a few weekends. It's just so much faster to write. Of course I could have bugs in those programs that Rust wouldn't allow. So I see the upside of Rust but it's just hard.

[–] [email protected] 4 points 7 months ago* (last edited 7 months ago)

I usually get issues with the borrow checker as soon as I move things around.

Once you get familiar with thinking in terms of the borrow checker (i.e. thinking in terms of how data can be safely accessed essentially), you'll be more at ease and you'll start building stuff from the start in ways that anticipate many of the issues you might run into. That's my experience at least.

You "just" need to consider how you structure and access the data in your program. If you've used languages with garbage collectors a lot before, you're not used to thinking like that because the garbage collector just accepts whatever structure you give it and says "well I guess I'll have to make it work somehow, someway (with a lot of effort)".

You'll find that once you structure your program in a way that the Rust compiler likes, it also becomes a lot easier to reason about it in general. A garbage collector won't "force" you to structure your program well in this way, which is why that kind of memory management often becomes messy when it scales to more than a few thousand lines of code.

Just having the right editor setup and such can also help a lot for productivity.

it’s complicated to get the lifetimes correct too

Lifetimes are an advanced topic that you mostly don't need when you're starting out. They can often be avoided by sacrificing some small performance, for instance by cloning data. But lifetimes are also a very cool feature, it just takes a bit to understand them well. Once you get them, it's not so bad though.

If you have any more specific questions about what you were trying to build or what errors you ran into, feel free to ask :)

[–] [email protected] 3 points 7 months ago (1 children)

I am an embedded C dev. I want to do embedded with rust just to learn, and be ready in case a client wants that for whatever reason (I just want to be hip).

Do you have any experience with embedded Rust in mcu? If so, how is the workflow?

Otherwise, what IDE would you recommend for Rust?

[–] [email protected] 5 points 7 months ago (1 children)

Unfortunately embedded programming is not something I have first hand experience with, but I've read how it works.

The Rust standard library is actually split into 3 layers:

  1. Core: This is core stuff built into the language. No dependencies needed. No memory allocation, no syscalls, no nothing basically. This defines the primitive types like i32 and str and pure functions on those.
  2. Alloc: Provides heap memory allocation on top of core, giving access to types like Vec and String. Still no operating system so no file IO or anything like that.
  3. Std: The full standard library that generally assumes the presence of an operating system with file systems and networking and all that jazz.
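
To give a feel for the lowest layer, here is a sketch of a helper that only touches items from core: it formats text into a fixed-size stack buffer, with no heap allocation. FixedBuf and format_reading are made up for the example, and the program as a whole still uses std (for println! in main), so this is not a real no_std setup, just an illustration of what is available without alloc or std.

```rust
use core::fmt::{self, Write};

// A fixed-size text buffer on the stack; hypothetical type for the example.
struct FixedBuf {
    bytes: [u8; 32],
    len: usize,
}

impl FixedBuf {
    fn new() -> Self {
        FixedBuf { bytes: [0; 32], len: 0 }
    }

    fn as_str(&self) -> &str {
        // Only whole &str values are ever copied in, so this is valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

// Implementing core::fmt::Write lets write! target the buffer.
impl Write for FixedBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len + s.len();
        if end > self.bytes.len() {
            return Err(fmt::Error); // buffer full, nothing to grow into
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

// Uses only core: no Vec, no String, no operating system.
fn format_reading(buf: &mut FixedBuf, celsius: i32) -> fmt::Result {
    write!(buf, "temp={celsius}C")
}

fn main() {
    let mut buf = FixedBuf::new();
    format_reading(&mut buf, 21).unwrap();
    println!("{}", buf.as_str()); // println! itself comes from std
}
```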

You probably want to check out the Rust embedded book.

The language is obviously still the same even if not using the full standard library. Feel free to ask more specific stuff, though I'm not very familiar with embedded.

[–] [email protected] 2 points 7 months ago (1 children)

Thanks for the answer.

Since I am more interested in the embedded side of Rust (though I don't mind the software side of it), do you think I am better off starting directly with embedded Rust? Since the std lib is not available for embedded.

[–] [email protected] 2 points 7 months ago

I think you should go for what interests you, that will be the best way to learn since you'll be motivated. If you mostly intend to do embedded anyway, learning anything else could just be a waste of time. I guess it'll just be easier since you won't have to learn the standard library :P

[–] [email protected] 3 points 7 months ago (1 children)

I've been wanting to write rust for quite some time, but I can't get over crates. The system just seems insecure to me. What happens in 10 years when the servers go down? Is there any sort of mitigation for supply chain attacks? As I understand it anyone can submit code; what's stopping someone from putting malicious code into a crate I've been using?

I suppose these are risks for any third party package system though.

I've used Flutter infrequently and have experienced things like this with their package system.

[–] [email protected] 3 points 7 months ago (1 children)

I’ve been wanting to write rust for quite some time, but I can’t get over crates. The system just seems insecure to me.

You're not the only one with this concern but it is essentially how modern package management works, not just for Rust but all modern programming languages.

What happens in 10 years when the servers go down?

While I don't think that would happen, there are ways to avoid this. You can host your own registry and mirror the crates.io crates, if you want.

Is there any sort of mitigation for supply chain attacks?

Whenever you have dependencies, you obviously need to either trust them or vet them. If the package is popular enough and the author is reliable enough, then you can choose to trust it. It really depends on what kind of risk you're willing to take on.

As I understand it anyone can submit code; what’s stopping someone from putting malicious code into a crate I’ve been using?

In principle nothing. Again, if you have dependencies, you need to vet them. This isn't really a Rust problem, it's just a general problem with depending on other people's code. You would still have this problem even if you manually downloaded external pieces of code from other people instead of via cargo.

In practice, there is a team managing crates.io and I believe they do look for malware or malicious crates (like crates with names very similar to popular crates that attempt to trick people into downloading due to a typo in the name).

But yes, this isn't really a problem with Rust specifically. I will say that the popular crates in the Rust ecosystem are generally very high quality and I have a fair bit of trust for them myself. Unless you are a big company that needs to carefully vet your dependencies, I wouldn't worry too much.

[–] [email protected] 3 points 7 months ago

Thanks for your detailed input, I'm glad to hear that there is a team that does look out for things at crates.io, and that I can host my own registry.

[–] [email protected] 2 points 7 months ago (1 children)

Hi,

Learning Rust and getting caught up in details, but I always want to know the whys of things, and the minor differences.

Let's start off with: is there a difference between the const value vs const reference?

// Given this little struct of mine, a Page with information about an endpoint
#[derive(Clone)]
pub struct Page<'a> {
    pub title: &'a str,
    pub endpoint: &'a str,
}

// Value
const ROOT_PAGE: Page = Page::new("Home", "/home");

// Reference
const ROOT_PAGE: &'static Page = &Page::new("Home", "/home");

  1. Since my functions always take a reference, is there any advantage to either of them? References are read-only, but since it's const it probably doesn't matter. What is preferred?

  2. I know String does allocations, while &str is a string slice or something which may be on the stack. Do I not end up making any allocations in this case, since structs are on the stack by default and only hold the pointers to a string "slice"? Especially given how they are made in this case.

  3. Are structs with references like this okay? This Page is constant but I'm going to make many "Pages" later based on the pages my blog has, as well as some other endpoints of course.

  4. If the struct is cloned, is the entire string cloned as well, or just the pointer to the string slice? I assume it does copy the entire string, since to the best of my knowledge a &str does not do any reference counting, so deleting a &str means deleting its content, not just the reference. Since that is the case, a clone will also copy the string.

  5. I am contemplating adding a "body" string to my Page struct. These strings will of course be large and vary depending on the page. Naturally, I do not want to end up copying the entire body string every time the Page is cloned. What is the best course here? It kind of depends on the previous question, but is an Arc the solution here? There is only reading to be done from them, so I do not need to worry about any owner holding it.

[–] [email protected] 2 points 7 months ago* (last edited 7 months ago) (1 children)

Let's start off with: is there a difference between the const value vs const reference?

No. Not any practical difference at least. AFAIK behind the scenes, the const reference is just a const value that Rust automatically creates a reference to. So it's just a matter of preference or what makes sense for your case.
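
A sketch of both variants side by side, in case it helps. The const fn new constructor is an assumption (the snippet in the question doesn't show it, but calling Page::new in a const requires it to be a const fn), the second constant is renamed to ROOT_PAGE_REF so both can coexist, and describe is a made-up function that takes a reference like yours do.

```rust
#[derive(Clone)]
pub struct Page<'a> {
    pub title: &'a str,
    pub endpoint: &'a str,
}

impl<'a> Page<'a> {
    // Assumed constructor: it must be a `const fn` to be callable
    // in a const initializer.
    pub const fn new(title: &'a str, endpoint: &'a str) -> Self {
        Page { title, endpoint }
    }
}

// Value
const ROOT_PAGE: Page = Page::new("Home", "/home");

// Reference (renamed here so both constants can live in one file)
const ROOT_PAGE_REF: &'static Page = &Page::new("Home", "/home");

// Hypothetical function taking a reference.
fn describe(page: &Page) -> String {
    format!("{} -> {}", page.title, page.endpoint)
}

fn main() {
    // With the value you borrow at the call site; with the
    // reference you pass it along directly.
    println!("{}", describe(&ROOT_PAGE));
    println!("{}", describe(ROOT_PAGE_REF));
}
```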

  1. Since my functions always take a reference, is there any advantage to either of them? References are read-only, but since it’s const it probably doesn’t matter. What is preferred?

I think I need to see an example of the function (or at least the signature) to give any concrete advice. It really depends on what you're doing and what the function is.

  2. I know String does allocations, while &str is a string slice or something which may be on the stack. Do I not end up making any allocations in this case, since structs are on the stack by default and only hold the pointers to a string “slice”? Especially given how they are made in this case.

Okay so there's multiple things to unpack here. First of all, &str is a string slice, as you say. It is a "fat pointer", as all slices are, with 2 components: A pointer to the data and a length.

However, &str does not say anything about whether or not the data that it points to or even the &str itself (pointer + length) is on the stack or not. The pointer inside the &str may point anywhere and there's no guarantee that even the &str itself is on the stack (e.g. a Box<&str> is a &str on the heap and the pointer inside that &str could point anywhere).

When you write a string literal like "Hello world!", it is a &'static str slice, where the &str itself (pointer + length) exists on the stack and the data that it points to is memory inside your final executable binary. But you can get &str's from elsewhere, like a &str that refers to memory on the heap that doesn't have a 'static lifetime.

So when you write a string literal like you are in your consts there, you are not allocating any memory on the heap (in fact it is impossible to allocate memory on the heap in a const context). All you're doing is adding a bit of string data to your final executable and that's what the &str points to. To allocate, you would need to use a String.
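
As a quick illustration (first_word is just a made-up helper): the same &str-taking function works whether the text lives in the executable or in a heap-allocated String, because &str doesn't care where the data is.

```rust
// Accepts any string slice, regardless of where its data lives.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    // The data of a literal is baked into the executable: &'static str.
    let literal: &'static str = "Hello world!";

    // A String owns heap memory; borrowing it gives a &str that points
    // into the heap and is only valid while `owned` is alive.
    let owned: String = String::from("Goodbye world!");
    let borrowed: &str = &owned;

    println!("{}", first_word(literal));
    println!("{}", first_word(borrowed));
}
```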

  3. Are structs with references like this okay? This Page is constant but I’m going to make many “Pages” later based on the pages my blog has, as well as some other endpoints of course.

If all of your Pages are constant and you will write them out like this with literal strings, it's not like there is any "problem". However, you'll only be able to make Pages that can be defined at compile-time. Maybe that's fine if all your pages are pre-defined ahead of time and don't use any dynamic values or templating? If so, you can keep it like this in principle. You could even write the pages in a separate file and use the include_str! macro to import them as a &'static str as if you had written it out directly in your code.

If you need just some Pages to have dynamic content, then you'll need to either use a String (which would make all of them heap-allocated) or you could use a Cow (clone-on-write pointer) which is an enum that can either be borrowed data like a &'static str or owned data like String. Using Cow you could allow the static pages to not allocate while the dynamic ones do.
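As a rough sketch of the Cow idea (the constructor names fixed and dynamic and the HOME constant are hypothetical, not part of the original Page type):

```rust
use std::borrow::Cow;

#[derive(Clone, Debug)]
pub struct Page {
    pub title: Cow<'static, str>,
    pub endpoint: Cow<'static, str>,
}

impl Page {
    // `const fn`, so pre-defined pages can still live in a `const` without allocating.
    pub const fn fixed(title: &'static str, endpoint: &'static str) -> Self {
        Page {
            title: Cow::Borrowed(title),
            endpoint: Cow::Borrowed(endpoint),
        }
    }

    // Pages built at runtime own their (heap-allocated) strings.
    pub fn dynamic(title: String, endpoint: String) -> Self {
        Page {
            title: Cow::Owned(title),
            endpoint: Cow::Owned(endpoint),
        }
    }
}

// A compile-time page: only borrowed string data, no allocation.
pub const HOME: Page = Page::fixed("Home", "/");
```

Both kinds of page can be read the same way, since Cow<str> dereferences to str. Note that cloning a Cow::Owned value does copy its String, while cloning a Cow::Borrowed value only copies the reference.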

  1. If the struct is cloned, is the entire string cloned as well, or just the pointer to the string slice? I assume it does copy the entire string, since to the best of my knowledge a &str does not do any reference counting, so deleting a &str means deleting its content, not just the reference. Since that is the case, a clone will also copy the string.

No, cloning a &str does not clone the underlying text data. You can see this via a simple program:

fn main() {
    let s1 = "Example text";
    let s2 = s1.clone();
    
    println!("{s1:p}");
    println!("{s2:p}");
}

The :p format specifier is the "pointer" specifier and will print the pointer of the string slice. If you run this, you'll see that it prints the exact same pointer twice - i.e. both s1 and s2 are pointing to the exact same data (which only exists once).

No reference counting is needed. Remember, Rust keeps track of reference lifetimes at compile-time. If you clone a &'a str then the clone is also a &'a str. This is fine: it's just another reference to the same string data, so it is valid for exactly as long as the original.

Note that mutable references cannot be cloned, as that would allow multiple mutable references to the same data and that's not allowed.

Note that dropping ("deleting" is not a term used in Rust) a &str does not free the underlying memory. It is just a reference after all, it doesn't own the memory underneath.
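A small sketch of these points, with hypothetical helper names (duplicate, example):

```rust
// Both returned references carry the same lifetime `'a` as the input.
// Copying a shared reference never copies the text it points to.
fn duplicate<'a>(s: &'a str) -> (&'a str, &'a str) {
    (s, s)
}

fn example() {
    let owner = String::from("Example text");
    {
        let (a, b) = duplicate(&owner);
        // Same address: both slices view the same bytes.
        assert_eq!(a.as_ptr(), b.as_ptr());
    } // `a` and `b` go out of scope here; nothing is freed.

    // `owner` still owns the data and is only freed when it is dropped itself.
    assert_eq!(owner, "Example text");
}
```

The same would not compile with &mut str in place of &str, because a mutable reference cannot be duplicated.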

  1. I am contemplating adding a “body” string to my Page struct. These strings will of course be large and vary depending on the page. Naturally, I do not want to end up copying the entire body string every time the Page is cloned. What is the best course here? It kind of depends on the previous question, but is an Arc the solution here? There is only reading to be done from them, so I do not need to worry about any owner holding it.

If the body is a &str, then it will only clone the reference and the string data itself will not be copied, so this shouldn't be an issue. But remember as I said above that this might only be true if your pages are indeed defined ahead-of-time and don't need dynamic memory allocation.
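If the bodies do end up being loaded at runtime, one possible shape (an assumption on my part, not your actual struct) is to store them as Arc<str>, so that cloning a Page only bumps a reference count:

```rust
use std::sync::Arc;

// Hypothetical variant of Page that owns its data through shared pointers.
#[derive(Clone)]
pub struct Page {
    pub title: Arc<str>,
    pub body: Arc<str>,
}

impl Page {
    pub fn new(title: &str, body: String) -> Self {
        Page {
            title: Arc::from(title),
            // Moves the String's contents into one shared, immutable allocation.
            body: Arc::from(body),
        }
    }
}
```

Cloning such a Page gives a second handle to the same body; Arc::ptr_eq(&a.body, &b.body) is true for a page and its clone, and the text is freed when the last handle is dropped.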

I can give more concrete advice if you give more code and explain your use case more thoroughly (see also XY problem).

[–] [email protected] 2 points 7 months ago* (last edited 7 months ago) (1 children)

Thanks for the great reply! (And sorry for that other complicated question... )

Knowing that &str is just a reference, makes sense when they are limited to compile time. The compiler naturally knows in that case when it's no longer used and can drop the string at the appropriate time. Or never dropped in my case, since it's const.

Since I'm reading files to serve webpages, I will need Strings. I just didn't get far enough to learn that yet.... and with that 'Cow' might be a good solution to having both. Just for a bit of extra performance when some const pages are used a lot.

For example code, here's a function. It simply takes a page and constructs HTML from a template, where my endpoint is used in it.

pub fn get_full_page(&self, page: &Page) -> String {
    self.handler
        .render(
            PageType::Root.as_str(),
            &json!({"content-target": &page.endpoint}),
        )
        .unwrap_or_else(|err| err.to_string())
}

Extra redundant context: All this is part of a blog I'm making from scratch, for fun and for learning Rust, with Htmx on the browser side. It's been fun finding out how to lazy load images. My site is essentially a single-page application until you use "back" or refresh the page: the main content part of the page is just replaced when you click a "link". So the above function is a "full serve" of my page. Partial serving isn't implemented using the Page structs yet; it just serves files at the moment. When the body is included, which would be the case for partial serves, I'll run into that &str issue.

[–] [email protected] 1 points 7 months ago

Cool, sounds like you have a lot of fun learning :)

[–] [email protected] 2 points 7 months ago (1 children)

Sorry, but a long and slightly complicated question, for a hypothetical case.

I wanted to serve pages in my blog. The blog doesn't actually exist yet (but works locally, need to find out how I can safely host it later...), but let's assume it becomes viral, and by viral I mean the entire internet has decided to use it. And they are all crazy picky about loading times....

I haven't figured out the structure of the Page objects yet, but for the question they can be like the last question:

#[derive(Clone)]
pub struct Page<'a> {
    pub title: &'a str,
    pub endpoint: &'a str,
}

I wanted to create a HashMap that held all my pages, and when I updated a source file, a thread would replace that page in the mapping. It's a rather trivial problem really. I didn't find out if I could update a mapping from a thread, so I decided to make each value something that could hold a page and have the page object replaced on demand. It made some sense, since I don't need to delete a page.

There is a trivial solution: just have each HashMap value be a RwLock with an Arc holding my large string. No large string copies, Arc makes it shared, and RwLock is fine since any number of readers can exist. Readers are only blocked while a write is happening. Good enough really.

But I heard about DoubleBuffers and thought, why can't I have an AtomicPointer to my data that always exists? Some work later and I had something holding an AtomicPointer with a reference to an Arc with my Page type. But it didn't work. It actually failed rather confusingly. It crashed as I was trying to read the title on my Page object after getting it from the Arc. There wasn't even any thread stuff going on: reading once works, the next time it crashes.

struct SharedPointer<T> {
    data: AtomicPtr<Arc<T>>,
}

impl<T> SharedPointer<T> {
    pub fn new(initial_value: T) -> SharedPointer<T> {
        SharedPointer {
            data: AtomicPtr::new(&mut Arc::new(initial_value)),
        }
    }

    pub fn read(&self) -> Arc<T> {
        unsafe { self.data.load(Relaxed).read_unaligned() }.clone()
    }

    pub fn swap(&self, new_value: T) {
        self.data.store(&mut Arc::new(new_value), Relaxed)
    }
}

#[test]
pub fn test_swapping_works_2() {
    let page2: Page = Page::new("test2", "/test2");
    let page: Page = Page::new("test", "/test");
    let entry: SharedPointer<Page> = SharedPointer::new(page.clone());

    let mut value = entry.read();

    assert_eq!(value.title, page.title);
    value = entry.read();
    assert_eq!(value.title, page.title);

    entry.swap(page2.clone());

    let value2 = entry.read();
    assert_eq!(value2.title, page2.title);
    assert_eq!(value.title, page.title);
}

This has undefined behavior, which isn't too surprising since I don't understand pointers that much... and I'm actually calling unsafe code. I have heard it can produce unexpected errors outside its block. I'm just surprised it works a little. This code sometimes fails the second assert with an empty string, crashes with an access violation, or one time it gave me a comparison where some of it was lots of question marks! My best understanding is that my Page or its content is moved or deallocated, but it's odd that my Arc seems perfectly fine. I just don't see the connection between the pointer and the Arc's content causing a crash.

I may just be doing the entire thing wrong, so sticking with RwLock is much better and safer since there is no unsafe code. But I seek to know why this is so bad in the first place. What is wrong here, and is there a remedy? Or is it just fundamentally wrong?

[–] [email protected] 2 points 7 months ago (1 children)

I wanted to serve pages in my blog. The blog doesn’t actually exist yet (but works locally, need to find out how I can safely host it later…), but lets assume it becomes viral, and by viral i mean the entire internet has decided to use it. And they are all crazy picky about loading times…

Of course it depends if doing this kind of optimization work is your goal but... if you just want a blog and you want it to be fast (even with many visitors, but perhaps not the entire internet...), I would say make a static web server that just serves the blog pages directly from &'static strs and predefine all blog posts ahead of time. For example, you could write all your blog posts in HTML in separate files and include them into your code at compile time.

You'd need to recompile your code with new blog post entries in order to update your blog... but like how often are you gonna add to your blog? Recompiling and redeploying the blog server wouldn't be an issue I imagine. That's how I would do it if I wanted a fast and simple blog.

Also general software development wisdom says "don't code for the future" aka YAGNI - you aren't gonna need it. I mean, sorry, but chances are the whole internet will not be crazy about visiting your blog so probably don't worry about it that much 😅. But it is a good learning thing to consider I guess.

#[derive(Clone)]
pub struct Page<'a> {
   pub title: &'a str,
   pub endpoint: &'a str,
}

I'm a little confused about the use of the word "endpoint" here - that usually indicates an API endpoint to me but I would think it would be the post contents instead? But maybe I'm just too hung up on the word choice.

I wanted to create a HashMap that held all my pages, and when I updated a source file, the a thread would replace that page in the mapping.

To me, this sounds like you want to dynamically (i.e. at runtime, while the server is running) keep track of which blog entry files exist and keep a shared hashmap of all the blog files.

So there's multiple things with that:

  1. You'd need to dynamically allocate the storage for the files on the heap as you load them in memory, since you don't know at compile-time how big the blog posts are. So they'd need to be a String, or an Arc<str> if you only need to load each one once and not change it.
  2. As you note, you'd need a way to share read-only references to the hashmap while also providing a way to add/remove entries to it at runtime. This requires some kind of lock-syncing like Mutex or RwLock, yes.
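Putting those two points together, a minimal sketch could look like the following. PageCache and its method names are hypothetical, and it uses one lock around the whole map rather than one per value.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Maps an endpoint to the page body that was loaded at runtime.
#[derive(Default)]
pub struct PageCache {
    pages: RwLock<HashMap<String, Arc<str>>>,
}

impl PageCache {
    // Many readers can hold the read lock at the same time.
    // Returning an Arc clone means the lock is released before the body is used.
    pub fn get(&self, endpoint: &str) -> Option<Arc<str>> {
        let pages = self.pages.read().unwrap();
        pages.get(endpoint).cloned()
    }

    // The write lock is held only while the map entry is swapped.
    pub fn replace(&self, endpoint: &str, body: String) {
        let mut pages = self.pages.write().unwrap();
        pages.insert(endpoint.to_string(), Arc::from(body));
    }
}
```

A file-watching thread can call replace through an Arc<PageCache> while request handlers call get. A reader that fetched the old body keeps it alive until it drops its Arc.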

why can’t I have a AtomicPointer to my data that always exist?

Does it always exist though? The way you talk about it now sounds like it's loaded at runtime, so it may or may not exist. I think I'd need to see more concrete code to know.

and I’m actually calling unsafe code. I have heard it can produce unexpected error outside it’s block.

Yes, indeed. Safe code must never produce undefined behaviour, but safe code assumes that all unsafe blocks do the correct thing. For instance, safe code will always assume a &str contains UTF-8 encoded data, but some unsafe code may have earlier changed the data inside of it to be some random data. That will break the safe code that makes the assumption! But it's not the safe code's fault.
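To illustrate the UTF-8 example: the safe constructor checks the invariant, which is why safe code can rely on it. The helper name parse_title below is made up.

```rust
use std::str::Utf8Error;

// Safe route: `std::str::from_utf8` validates the bytes and returns an error
// instead of ever producing an invalid `&str`.
fn parse_title(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}

// The unsafe counterpart, `std::str::from_utf8_unchecked`, skips that check.
// Calling it with invalid bytes is undefined behaviour, and the resulting
// problem may only show up much later, in safe code that trusts the `&str`.
```

So parse_title(b"Hello") gives back the string, while parse_title(&[0xff, 0xfe]) is an Err.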

Unsafe in general is a very sharp tool and you should be careful. In the best case, your program crashes. In worse cases, your program continues with garbage data and slowly corrupts more and more. In the even worse case, your program almost always works but rarely produces undefined behaviour that is extremely hard to track down. You could also accidentally introduce security vulnerabilities even if your code works correctly most of the time.

In general, I would advise you to avoid unsafe like the plague unless you really need it. A hypothetical optimization is certainly not such a case. If you really want to use unsafe, you definitely need to carefully peruse the Rustonomicon first.

In your specific case, the problem is (of course) with the unsafe block:

unsafe { self.data.load(Relaxed).read_unaligned() }.clone()

So what is this doing? Well self.data.load(Relaxed) returns a *mut Arc<T> but it is only using safe code so the problem must be with the read_unaligned call. This makes sense, obtaining a raw pointer is fine, it's only using it that may be unsafe.

If we check the docs for the read_unaligned function, it says:

Reads the value from self without moving it. This leaves the memory in self unchanged.

Here "self" is referring to the *mut Arc<T> pointer. So this says that it reads the Arc<T> directly from the memory pointed to by the pointer.

Why is this a problem? It's a problem because Arc<T> is a reference-counted pointer, but you've just made one without increasing the reference count! So the Arc believes there are n references but in fact there are n + 1 references! This is bad! Once the Arc is dropped, it will decrease the reference count by 1. If the reference count is 0, it will drop the underlying data (the T).

So let's say you get into this situation with 2 Arcs but actually the reference count is 1. The first one will drop and will try to free the memory since the reference count is now 0. The second one will drop at some later time and try to update the reference count, but it's writing into memory that has been freed so it will probably get a segmentation fault. If it doesn't get the segfault, it will get a problem once it tries to free the memory since it has already been freed. Double free is bad!

So yea that's why it probably works once (first arc gets dropped) but not twice (second arc gets a bad experience).
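For comparison, here is a sketch of the same SharedPointer interface built only from safe code, using the RwLock approach mentioned in the question. It is one possible fix, not the only one. As a side note, `&mut Arc::new(...)` in new and swap takes the address of a temporary Arc that is dropped once the surrounding expression finishes, so the stored pointer appears to dangle from the start; the safe version avoids that too, because the lock owns the Arc.

```rust
use std::sync::{Arc, RwLock};

struct SharedPointer<T> {
    // The lock owns the current Arc, so it cannot be freed while stored here.
    data: RwLock<Arc<T>>,
}

impl<T> SharedPointer<T> {
    pub fn new(initial_value: T) -> SharedPointer<T> {
        SharedPointer {
            data: RwLock::new(Arc::new(initial_value)),
        }
    }

    // Clones the Arc properly, which increments the reference count.
    pub fn read(&self) -> Arc<T> {
        let guard = self.data.read().unwrap();
        Arc::clone(&*guard)
    }

    // Replaces the stored Arc. Readers that already hold the old one
    // keep it alive until they drop it.
    pub fn swap(&self, new_value: T) {
        let mut guard = self.data.write().unwrap();
        *guard = Arc::new(new_value);
    }
}
```

With this version, a value obtained from read before a swap still sees the old data afterwards, and a later read sees the new data.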

[–] [email protected] 2 points 7 months ago* (last edited 7 months ago)

Ah, so I'm actually cheating with the pointer reading: I'm making a clone of the Arc without using clone()... and then dropping it to kill the data. I had assumed it just gave me that object so I could use it. I saw other double buffer implementations (aka write in one place, read from another place, and then swap them safely) use arrays with double values, but I wasn't much of a fan of that. There are some other ideas for lock-free swapping, using indexes and options, but they seemed less clean. So RwLock is simplest.

And yeah, if I wanted a simple blog, single files or const strings would do. But that is boring! I mentioned in the other reply, but it's purely for fun and learning. And then it needs all the bells and whistles. Writing html is awful, so I write markdown files and use a crate to convert it to html, and along the way replace image links with lazy loading versions that don't load until scrolled down to. Why? Because I can! Now it just loads from files, but if I bother later I'll cache them in memory and add file watching to replace the cached version. Aka an idea of the issue here.

[–] [email protected] 1 points 7 months ago (1 children)

Great thread! Just subscribed to this c/

Sorry, I know it's a few days old now, but I thought I'd just chime in and ask my question.

First and foremost I'm a self taught web Dev(TypeScript, NodeJS, HTML, CSS), who has also done some small bit of learning C (built a basic UNIX shell and rebuilt some of the ls command in C) and done some shell scripting.

I'm about half way through the Book and am also following along with a 9 hour long intro Video course from Free Code Camp where the instructor generally just has you go through practice.rs

I'm mainly interested in using Rust as my go to back end language for HTTP/TCP servers and developing JSON and HTML APIs. Can you tell me which frameworks and crates/packages would be good for me to be aware of?

I'm also interested in creating some CLI and TUI applications, so any frameworks/crates/packages I should be aware of in that realm you might recommend would also be greatly appreciated!

Thanks so much. I got some great insights just by perusing this thread thus far!

[–] [email protected] 1 points 7 months ago (1 children)

I’m mainly interested in using Rust as my go to back end language for HTTP/TCP servers and developing JSON and HTML APIs. Can you tell me which frameworks and crates/packages would be good for me to be aware of?

Look into axum. It's built on top of tower and tower-http, which is a general server/client framework. Axum makes it super easy to make HTTP servers, using either raw HTML or JSON. It has in-built functionality for JSON and if you want to do HTML you can reach for stuff like maud or askama.

There are lots of crates around axum as well. You can try searching lib.rs, which is a bit more nicely categorized than crates.io. For logging, for example, look into tracing.

I’m also interested in creating some CLI and TUI applications, so any frameworks/crates/packages I should be aware of in that realm you might recommend would also be greatly appreciated!

For command-line arguments, use the derive functionality from clap. It lets you declare the arguments you want/need as a type and produces all the argument parsing logic for you.

For TUI, I haven't tried it myself but I've heard that ratatui is good.

[–] [email protected] 1 points 7 months ago (1 children)

Thanks so much! I'll be bookmarking these and checking them out as I go along.

Lastly, I just wanted to ask a couple more questions if that's OK.

How long did it take you before you started to become proficient enough in Rust that you could be productive for your employer? Were you already proficient in other systems level programming languages like C or C++ before learning Rust?

Did you get hired as a Rust developer, or were you working for your current employer utilizing another programming language and you eventually moved to developing in Rust?

Do you see there being more jobs utilizing Rust in the future?

I know that's a lot, so if you don't want to field all of those, I understand, but I'm very curious so I thought I'd just put those out there.

Thanks again!

[–] [email protected] 1 points 7 months ago* (last edited 7 months ago) (1 children)

How long did it take you before you started to become proficient enough in Rust that you could be productive for your employer?

Not too long, around 3 months maybe. But it depends how much time you spend obviously. Learning the language is fairly quick. Learning the more exotic parts of the language took a bit longer but that's mostly cause I didn't need those things until later. Learning the package ecosystem is also something that can take a bit of research and you kinda have to just keep yourself up to date about cool crates via blog posts and sharing on online communities like this. But all this will probably depend on your prior expertise. I have a master's in computer science so it wasn't a huge deal for me.

Were you already proficient in other systems level programming languages like C or C++ before learning Rust?

I was... okay at C++ before-hand so kinda. But honestly C++ is such a shitty language that looking back I barely had any grasp at that time honestly. With Rust, it's so much easier and I understand how the system works so much better now, simply because Rust forces me to understand it. The compiler is a great teacher!

Did you get hired as a Rust developer, or were you working for your current employer utilizing another programming language and you eventually moved to developing in Rust?

I was not hired as a Rust developer. There was actually barely any Rust at the company when I joined. There were a few other colleagues interested in it and when I came in we really went for it. It took some convincing of management and stuff but now we use it in a lot of places and I write almost exclusively Rust at work.

But I think I was very lucky in this aspect. There are few places where you will have such an opportunity to influence the technology in that way.

Do you see there being more jobs utilizing Rust in the future?

110%. Rust is set to replace languages like C and C++, and at the same time it is heavily competing with other programming languages. Even languages like Python. There's a huge opportunity to improve software reliability across the field with Rust.

Rust is supported by the largest tech companies in the world and is getting integrated into Linux. There has not been a language with this level of dedication and support behind it for a long time.

Growth is happening and it will only accelerate in the coming years. You can even see it happening on Google Trends. It's a great time to learn the language to get ahead of the curve!

I know that’s a lot, so if you don’t want to field all of those, I understand, but I’m very curious so I thought I’d just put those out there.

Hey I made this thread to answer questions, thank you for asking! I'm sure there are many lurkers who were also curious.

[–] [email protected] 1 points 7 months ago* (last edited 7 months ago) (1 children)

Thanks so very much. Very informative and encouraging. As a mainly TypeScript developer who's only done some dabbling in C, bash, and python, I've been looking for a language that's a bit more abstracted than C, but not so pigeonholed into specific use cases like Golang (I'm still developing an opinion on Golang, not sure how I feel about it).

Rust so far has appeared like quite a beautiful language and the compiler in particular is the best I've ever seen in terms of helpful error/warning messages!

I'm sure I'll have my small complaints as I struggle to get good at Rust in the near future, but I think this is going to be my go to back end language for some time.

I have plans to eventually convert the C code of the terminal based browser, links, to a Rust project to learn more about how a very basic browser is built. I'd also like to do the same for the TUI system monitoring tool btop, which is written in C++.

I think just attempting those two "rewrite it in Rust" projects, once I have other smaller projects under my belt, will probably give me a good understanding not only of Rust, but also aspects of the HTTP/HTTPS protocols and systems programming not commonly encountered in the field of web development.

Last question, I promise, lol. But what do you make of this plan? Are there any caveats or concerns I should be made aware of in regards to this endeavor?

Again, thanks for everything!

[–] [email protected] 1 points 7 months ago (1 children)

(I’m still developing an opinion on Golang, not sure how I feel about it).

I don't have concrete experience with Go but I've read enough about the language to form an armchair opinion. If you ask me, it seems pretty bad. It's like you just took C and you threw a garbage collector and an async runtime on top and called it a day. No lessons learned from any of the 40 years prior of programming language theory, just C with a garbage collector. I think the only reason anyone is using Go is because it's Google and they pushed it a lot. If someone made Go today and wasn't a billion-dollar corporation and tried to convince people to use it, nobody would use it.

I have plans to eventually convert the C code of the terminal based browser, links, to a Rust project to learn more about how a very basic browser is built.

I usually use reqwest for HTTP request stuff. But if your goal is to learn about more low-level stuff, you might want to use a lower-level library like hyper, or even use only the stuff in the standard library.
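To give an idea of what the standard-library-only route involves, here is a minimal sketch of a plain HTTP/1.1 GET over TCP. The function names build_get_request and fetch are made up for this example, and it does no TLS, redirects, or chunked-encoding handling, so it only works against plain http servers.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

// Build a minimal HTTP/1.1 GET request by hand.
// `Connection: close` asks the server to close the socket after responding,
// which lets the client simply read until end-of-stream.
fn build_get_request(host: &str, path: &str) -> String {
    format!("GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n")
}

// Send the request over plain TCP and return the raw response
// (status line, headers and body together).
fn fetch(host: &str, port: u16, path: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect((host, port))?;
    stream.write_all(build_get_request(host, path).as_bytes())?;

    let mut response = Vec::new();
    stream.read_to_end(&mut response)?;
    Ok(String::from_utf8_lossy(&response).into_owned())
}
```

Parsing the status line and headers out of that raw string is where most of the learning happens, and it is exactly the work that hyper and reqwest do for you.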

I’d also like to do the same for the TUI system monitoring tool btop, which is written in C++.

I'm a big fan of bottom, which is a TUI resource monitor. Maybe you'll get some inspiration from there.

But what do you make of this plan? Are there any caveats or concerns I should be made aware of in regards to this endeavor?

I can't really think of any problems. I think it sounds like a good idea to build some concrete stuff and see what you run into. Just realize that it might take a while before you get used to writing idiomatic Rust code, so don't expect your first project to be your prettiest work... 😅

[–] [email protected] 2 points 7 months ago* (last edited 6 months ago) (1 children)

Definitely. Okay, that's about all I have to ask now. I'm bookmarking this thread though to refer back to. You've given me some great insights and resources, and have also pointed me in the right direction going forward.

For now I'll be just making my way through the Book. I also have Programming Rust, by O'Reilly, Command Line Rust by O'Reilly, and Rust for Rustaceans to reference along with the plethora of online resources.

I might PM you some time in the future (if that's okay) should I get stuck on something I can't figure out through the usual means (i.e. documentation, stack overflow, etc.).

Again, can't thank you enough for the help. Cheers!

[–] [email protected] 2 points 7 months ago

plethora of online resources

There's also zero2prod.com which is really nice as well. There's even a free sample of the book online.

I might PM you some time in the future (if that’s okay) should I get stuck on something I can’t figure out through the usual means (i.e. documentation, stack overflow, etc.).

Feel free to but also consider just posting a thread here so others can also see and learn 🙂. Just be sure to @ me to make sure I see it.